ANALYZE ONLY
Hello Postgres Hackers
An application that I am developing uses Postgresql, and includes a fairly large
number of partitioned tables which are used to store time series data.
The tables are partitioned by time, and typically there is only one partition
at a time - the current partition - that is actually being updated.
Older partitions
are available for query and eventually dropped.
As per the documentation, partitioned tables are not analyzed by the autovacuum
workers, although their partitions are. Statistics are needed on the partitioned
table level for at least some query planning activities.
The problem is that giving an ANALYZE command targeting a partitioned table
causes it to update statistics for the partitioned table AND all the individual
partitions. There is currently no option to prevent it from including the
partitions.
This is wasteful for our application: for one thing the autovacuum
has already analyzed the individual partitions; for another most of
the partitions
will have had no changes, so they don't need to be analyzed repeatedly.
I took some measurements when running ANALYZE on one of our tables. It
took approx
4 minutes to analyze the partitioned table, then 29 minutes to analyze the
partitions. We have hundreds of these tables, so the cost is very significant.
For my use case at least it would be fantastic if we could add an ONLY option
to ANALYZE, which would cause it to analyze the named table only and not
descend into the partitions.
I took a look at the source and it looks doable, but before launching into it
I thought I would ask a few questions here.
1. Would such a feature be welcomed? Are there any traps I might not
have thought of?
2. The existing ANALYZE command has the following structure:
ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name?
An alternative would be:
ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]
Any feedback or advice would be great.
Regards
Mike.
On Tue, 20 Aug 2024 at 07:52, Michael Harris <harmic@gmail.com> wrote:
1. Would such a feature be welcomed? Are there any traps I might not
have thought of?
I think this sounds like a reasonable feature.
2. The existing ANALYZE command has the following structure:
ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name?
An alternative would be:ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]
Any feedback or advice would be great.
Personally I'd prefer a new option to be added. But I agree ONLY isn't
a good name then. Maybe something like SKIP_PARTITIONS.
On 20.8.24 10:42, Jelte Fennema-Nio wrote:
On Tue, 20 Aug 2024 at 07:52, Michael Harris<harmic@gmail.com> wrote:
1. Would such a feature be welcomed? Are there any traps I might not
have thought of?I think this sounds like a reasonable feature.
2. The existing ANALYZE command has the following structure:
ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name?
An alternative would be:ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]
Any feedback or advice would be great.
Personally I'd prefer a new option to be added. But I agree ONLY isn't
a good name then. Maybe something like SKIP_PARTITIONS.
Hi everyone,
Your proposal is indeed interesting, but I have a question: can't your
issue be resolved by properly configuring |autovacuum| instead of
developing a new feature for |ANALYZE|?
From my perspective, |ANALYZE| is intended to forcibly gather
statistics from all partitioned tables. If the goal is to ensure that
statistics are updated at the right moment, adjusting the
|autovacuum_analyze_threshold| and |autovacuum_analyze_scale_factor|
parameters might be the solution.
--
Regards,
Ilia Evdokimov,
Tantor Labs LCC.
On Tue, 20 Aug 2024 at 23:25, Ilia Evdokimov
<ilya.evdokimov@tantorlabs.com> wrote:
Your proposal is indeed interesting, but I have a question: can't your issue be resolved by properly configuring autovacuum instead of developing a new feature for ANALYZE?
Basically, no. There's a "tip" in [1]https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-STATISTICS which provides information on
the limitation, namely:
"The autovacuum daemon does not issue ANALYZE commands for partitioned
tables. Inheritance parents will only be analyzed if the parent itself
is changed - changes to child tables do not trigger autoanalyze on the
parent table. If your queries require statistics on parent tables for
proper planning, it is necessary to periodically run a manual ANALYZE
on those tables to keep the statistics up to date."
There is also some discussion about removing the limitation in [2]/messages/by-id/CAKkQ508_PwVgwJyBY=0Lmkz90j8CmWNPUxgHvCUwGhMrouz6UA@mail.gmail.com.
While I agree that it would be nice to have autovacuum handle this,
it's not clear how exactly it would work. Additionally, if we had
that, it would still be useful if the ANALYZE command could be
instructed to just gather statistics for the partitioned table only.
David
[1]: https://www.postgresql.org/docs/devel/routine-vacuuming.html#VACUUM-FOR-STATISTICS
[2]: /messages/by-id/CAKkQ508_PwVgwJyBY=0Lmkz90j8CmWNPUxgHvCUwGhMrouz6UA@mail.gmail.com
Hi Michael,
Thanks for starting this thread. I've also spent a bit time on this after
reading your first thread on this issue [1]
Michael Harris <harmic@gmail.com>, 20 Ağu 2024 Sal, 08:52 tarihinde şunu
yazdı:
The problem is that giving an ANALYZE command targeting a partitioned table
causes it to update statistics for the partitioned table AND all the
individual
partitions. There is currently no option to prevent it from including the
partitions.This is wasteful for our application: for one thing the autovacuum
has already analyzed the individual partitions; for another most of
the partitions
will have had no changes, so they don't need to be analyzed repeatedly.
I agree that it's a waste to analyze partitions when they're already
analyzed by autovacuum. It would be nice to have a way to run analyze only
on a partitioned table without its partitions.
I took some measurements when running ANALYZE on one of our tables. It
took approx
4 minutes to analyze the partitioned table, then 29 minutes to analyze the
partitions. We have hundreds of these tables, so the cost is very
significant.
I quickly tweaked the code a bit to exclude partitions when a partitioned
table is being analyzed. I can confirm that there is a significant gain
even on a simple case like a partitioned table with 10 partitions and 1M
rows in each partition.
1. Would such a feature be welcomed? Are there any traps I might not
have thought of?
2. The existing ANALYZE command has the following structure:
ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name
An alternative would be:ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]
I feel closer to adding this as an option instead of a new keyword in
ANALYZE grammar. To me, it would be easier to have this option and then
give the names of partitioned tables as opposed to typing ONLY before each
partition table.
But we should think of another name as ONLY is used differently (attached
to the table name as you mentioned) in other contexts.
I've been also thinking about how this new option should affect inheritance
tables. Should it have just no impact on them or only analyze the parent
table without taking child tables into account? There are two records for
an inheritance parent table in pg_statistic, one row for only the parent
table and a second row including children. We might only analyze the parent
table if this new "ONLY" option is specified. I'm not sure if that would be
something users would need or not, but I think this option should behave
similarly for both partitioned tables and inheritance tables.
If we decide to go with only partition tables and not care about
inheritance, then naming this option to SKIP_PARTITIONS as Jelte suggested
sounds fine. But that name wouldn't work if this option will affect
inheritance tables.
Thanks,
--
Melih Mutlu
Microsoft
Melih Mutlu <m.melihmutlu@gmail.com>, 20 Ağu 2024 Sal, 19:26 tarihinde şunu
yazdı:
Hi Michael,
Thanks for starting this thread. I've also spent a bit time on this after
reading your first thread on this issue [1]
Forgot to add the reference [1]/messages/by-id/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com
[1]: /messages/by-id/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com
/messages/by-id/CADofcAXVbD0yGp_EaC9chmzsOoSai3jcfBCnyva3j0RRdRvMVA@mail.gmail.com
On Tue, Aug 20, 2024 at 1:52 AM Michael Harris <harmic@gmail.com> wrote:
2. The existing ANALYZE command has the following structure:
ANALYZE [ ( option [, ...] ) ] [ table_and_columns [, ...] ]
It would be easiest to add ONLY as another option, but that
doesn't look quite
right to me - surely the ONLY should be attached to the table name?
An alternative would be:ANALYZE [ ( option [, ...] ) ] [ONLY] [ table_and_columns [, ...] ]
Any feedback or advice would be great.
I like trying to use ONLY somehow.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
I like trying to use ONLY somehow.
Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?
Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.
1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.
Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the
behaviour of #2.
If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:
postgres=# analyze (skip_locked) a;
WARNING: skipping analyze of "a" --- lock not available
David
David Rowley <dgrowleyml@gmail.com> writes:
On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
I like trying to use ONLY somehow.
Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?
Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.
1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.
FWIW, I think that's the right approach, for consistency with the
way that ONLY works in DML.
regards, tom lane
On Tue, Aug 20, 2024 at 6:53 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
I like trying to use ONLY somehow.
Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?
The table modifier idea seems nice to me.
If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:postgres=# analyze (skip_locked) a;
WARNING: skipping analyze of "a" --- lock not available
Perhaps. I'm not sure about this part.
--
Robert Haas
EDB: http://www.enterprisedb.com
David Rowley <dgrowleyml@gmail.com>, 21 Ağu 2024 Çar, 01:53 tarihinde şunu
yazdı:
On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
I like trying to use ONLY somehow.
Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.
I believe we should go this route if we want this to be called "ONLY" so
that it would be consistent with other places too.
Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the
behaviour of #2.
Havin it as an option would be easier to use when we have several partition
tables. But I agree that if we call it "ONLY ", it may be confusing and the
other approach would be appropriate.
If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:postgres=# analyze (skip_locked) a;
WARNING: skipping analyze of "a" --- lock not available
+1 to raising a warning message in that case instead of staying silent. We
might also not allow ONLY if ANALYZE is not present in VACUUM query and
raise an error. But that would require changes in grams.y and could
complicate things. So it may not be necessary and we may be fine with just
a warning.
Regards,
--
Melih Mutlu
Microsoft
Thank you all for the replies & discussion.
It sounds like more are in favour of using the ONLY syntax attached to
the tables is the best way to go, with the main advantages being:
- consistency with other commands
- flexibility in allowing to specify whether to include partitions
for individual tables when supplying a list of tables
I will start working on an implementation along those lines. It looks
like we can simply replace qualified_name with relation_expr in the
production for vacuum_relation within gram.y.
One other thing I noticed when reading the code. The function
expand_vacuum_rel in vacuum.c seems to be responsible for adding the
partitions. If I am reading it correctly, it only adds child tables in
the case of a partitioned table, not in the case of an inheritance
parent:
include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
..
if (include_parts)
{
.. add partitions ..
This is a little different to some other contexts where the ONLY
keyword is used, in that ONLY would be the default and only available
mode of operation for an inheritance parent.
Regards,
Mike
Show quoted text
On Wed, 21 Aug 2024 at 20:04, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
David Rowley <dgrowleyml@gmail.com>, 21 Ağu 2024 Çar, 01:53 tarihinde şunu yazdı:
On Wed, 21 Aug 2024 at 06:41, Robert Haas <robertmhaas@gmail.com> wrote:
I like trying to use ONLY somehow.
Do you mean as an ANALYZE command option, i.e. ANALYZE (only) table;
or as a table modifier like gram.y's extended_relation_expr?Making it a command option means that the option would apply to all
tables listed, whereas if it was more like an extended_relation_expr,
the option would be applied per table listed in the command.1. ANALYZE ONLY ptab, ptab2; -- gather stats on ptab but not on its
partitions but get stats on ptab2 and stats on its partitions too.
2. ANALYZE ONLY ptab, ONLY ptab2; -- gather stats on ptab and ptab2
without doing that on any of their partitions.I believe we should go this route if we want this to be called "ONLY" so that it would be consistent with other places too.
Whereas: "ANALYZE (ONLY) ptab, ptab2;" would always give you the
behaviour of #2.Havin it as an option would be easier to use when we have several partition tables. But I agree that if we call it "ONLY ", it may be confusing and the other approach would be appropriate.
If we did it as a per-table option, then we'd need to consider what
should happen if someone did: "VACUUM ONLY parttab;". Probably
silently doing nothing wouldn't be good. Maybe a warning, akin to
what's done in:postgres=# analyze (skip_locked) a;
WARNING: skipping analyze of "a" --- lock not available+1 to raising a warning message in that case instead of staying silent. We might also not allow ONLY if ANALYZE is not present in VACUUM query and raise an error. But that would require changes in grams.y and could complicate things. So it may not be necessary and we may be fine with just a warning.
Regards,
--
Melih Mutlu
Microsoft
On Thu, 22 Aug 2024 at 11:32, Michael Harris <harmic@gmail.com> wrote:
One other thing I noticed when reading the code. The function
expand_vacuum_rel in vacuum.c seems to be responsible for adding the
partitions. If I am reading it correctly, it only adds child tables in
the case of a partitioned table, not in the case of an inheritance
parent:include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
..
if (include_parts)
{
.. add partitions ..This is a little different to some other contexts where the ONLY
keyword is used, in that ONLY would be the default and only available
mode of operation for an inheritance parent.
That's inconvenient and quite long-established behaviour. I had a look
as far back as 9.2 and we only analyze parents there too. I'm keen on
the ONLY syntax, but it would be strange if ONLY did the same thing as
not using ONLY for inheritance parents.
I feel like we might need to either bite the bullet and make ONLY work
consistently with both, or think of another way to have ANALYZE not
recursively gather stats for each partition on partitioned tables.
Could we possibly get away with changing inheritance parent behaviour?
David
David Rowley <dgrowleyml@gmail.com> writes:
I feel like we might need to either bite the bullet and make ONLY work
consistently with both, or think of another way to have ANALYZE not
recursively gather stats for each partition on partitioned tables.
Could we possibly get away with changing inheritance parent behaviour?
Yeah, my thought was to change the behavior so it's consistent in
that case too. This doesn't seem to me like a change that'd
really cause anybody serious problems: ANALYZE without ONLY would
do more than before, but it's work you probably want done anyway.
regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> wrote:
Yeah, my thought was to change the behavior so it's consistent in
that case too. This doesn't seem to me like a change that'd
really cause anybody serious problems: ANALYZE without ONLY would
do more than before, but it's work you probably want done anyway.
Would we want to apply that change to VACUUM too? That might be a
bit drastic, especially if you had a multi-level inheritance structure featuring
large tables.
It feels a bit like VACUUM and ANALYZE have opposite natural defaults here.
For VACUUM it does not make much sense to vacuum only at the partitioned
table level and not include the partitions, since it would do nothing
- that might
be why the existing code always adds the partitions.
On Thu, 22 Aug 2024 at 13:38, Michael Harris <harmic@gmail.com> wrote:
Would we want to apply that change to VACUUM too? That might be a
bit drastic, especially if you had a multi-level inheritance structure featuring
large tables.
I think they'd need to work the same way as for "VACUUM (ANALYZE)", it
would be strange to analyze some tables that you didn't vacuum. It's
just a much bigger pill to swallow in terms of the additional effort.
It feels a bit like VACUUM and ANALYZE have opposite natural defaults here.
For VACUUM it does not make much sense to vacuum only at the partitioned
table level and not include the partitions, since it would do nothing
- that might
be why the existing code always adds the partitions.
Yeah, I suspect that's exactly why it was coded that way.
David
Hi All,
Here is a first draft of a patch to implement the ONLY option for
VACUUM and ANALYZE.
I'm a little nervous about the implications of changing the behaviour of VACUUM
for inheritance structures; I can imagine people having regularly
executed scripts
that currently vacuum all the tables in their inheritance structure;
after this change
they might get more vacuuming than they bargained for.
It's my first attempt to submit a patch here so apologies if I've
missed any part
of the process.
Cheers
Mike
Show quoted text
On Thu, 22 Aug 2024 at 12:09, David Rowley <dgrowleyml@gmail.com> wrote:
On Thu, 22 Aug 2024 at 13:38, Michael Harris <harmic@gmail.com> wrote:
Would we want to apply that change to VACUUM too? That might be a
bit drastic, especially if you had a multi-level inheritance structure featuring
large tables.I think they'd need to work the same way as for "VACUUM (ANALYZE)", it
would be strange to analyze some tables that you didn't vacuum. It's
just a much bigger pill to swallow in terms of the additional effort.It feels a bit like VACUUM and ANALYZE have opposite natural defaults here.
For VACUUM it does not make much sense to vacuum only at the partitioned
table level and not include the partitions, since it would do nothing
- that might
be why the existing code always adds the partitions.Yeah, I suspect that's exactly why it was coded that way.
David
Attachments:
0001-Initial-implementation-of-the-ONLY-feature.patchapplication/octet-stream; name=0001-Initial-implementation-of-the-ONLY-feature.patchDownload
From 6e8a1e103d87e5644fea0a547e8b23b77ff8f78d Mon Sep 17 00:00:00 2001
From: Michael Harris <harmic@gmail.com>
Date: Thu, 22 Aug 2024 18:45:07 +1000
Subject: [PATCH] Initial implementation of the 'ONLY' feature for ANALYZE and
VACUUM.
This patch also changes the behaivour of ANALYZE and
VACUUM when given on parent tables in an inheritance
structure, to align with the case of partitioned tables.
In the absence of ONLY, descendant tables will be
included.
Refer to the thread at:
https://www.postgresql.org/message-id/flat/CADofcAWATx_haD%3DQkSxHbnTsAe6%2Be0Aw8Eh4H8cXyogGvn_kOg%40mail.gmail.com
---
compile_flags.txt | 1 +
doc/src/sgml/ref/analyze.sgml | 11 +++--
doc/src/sgml/ref/vacuum.sgml | 10 +++--
src/backend/commands/vacuum.c | 30 ++++++++++----
src/backend/parser/gram.y | 2 +-
src/test/regress/expected/vacuum.out | 62 ++++++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql | 40 ++++++++++++++++++
7 files changed, 140 insertions(+), 16 deletions(-)
create mode 100644 compile_flags.txt
diff --git a/compile_flags.txt b/compile_flags.txt
new file mode 100644
index 0000000000..7297dbe134
--- /dev/null
+++ b/compile_flags.txt
@@ -0,0 +1 @@
+-Isrc/include/
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2b94b378e9..9ed39b0206 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
<phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
@@ -142,9 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
- inheritance statistics of the partitioned table as a whole and
- statistics of the individual partitions are updated.
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are analyzed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9857b35627..a0e062baa1 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
<phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
@@ -401,8 +401,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If the specified table is a partitioned
- table, all of its leaf partitions are vacuumed.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified before
+ the table name, only that table is vacuumed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are vacuumed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..cd37f90687 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -24,6 +24,7 @@
#include "postgres.h"
#include <math.h>
+#include <stdbool.h>
#include "access/clog.h"
#include "access/commit_ts.h"
@@ -879,11 +880,12 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
}
else
{
- /* Process a specific relation, and possibly partitions thereof */
+ /* Process a specific relation, and possibly partitions and/or child tables thereof */
Oid relid;
HeapTuple tuple;
Form_pg_class classForm;
- bool include_parts;
+ bool include_children;
+ bool is_partitioned_table;
int rvr_opts;
/*
@@ -944,20 +946,32 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
+ /*
+ * Vacuuming a partitioned table with ONLY will not do anything
+ * since the partitioned table itself is empty. Issue a warning
+ * if the user requests this.
+ */
+ include_children = vrel->relation->inh;
+ is_partitioned_table = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && ! include_children)
+ ereport(WARNING,
+ (errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
+ vrel->relation->relname)));
+
- include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
ReleaseSysCache(tuple);
/*
- * If it is, make relation list entries for its partitions. Note that
+ * Unless the user has specified ONLY, make relation list entries for
+ * its partitions and/or descendant tables. Note that
* the list returned by find_all_inheritors() includes the passed-in
* OID, so we have to skip that. There's no point in taking locks on
- * the individual partitions yet, and doing so would just add
+ * the individual partitions/tables yet, and doing so would just add
* unnecessary deadlock risk. For this last reason we do not check
- * yet the ownership of the partitions, which get added to the list to
- * process. Ownership will be checked later on anyway.
+ * yet the ownership of the partitions/tables, which get added to the
+ * list to process. Ownership will be checked later on anyway.
*/
- if (include_parts)
+ if (include_children)
{
List *part_oids = find_all_inheritors(relid, NoLock, NULL);
ListCell *part_lc;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b7d98eb9f0..7cb3a9092a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11939,7 +11939,7 @@ opt_name_list:
;
vacuum_relation:
- qualified_name opt_name_list
+ relation_expr opt_name_list
{
$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
}
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 2eba712887..e1827e6e10 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,6 +291,68 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF vacparted FOR VALUES IN (1);
+ERROR: partition "only_parted1" would overlap partition "vacparted1"
+LINE 1: ...TABLE only_parted1 PARTITION OF vacparted FOR VALUES IN (1);
+ ^
+INSERT INTO only_parted1 VALUES (1, 'a');
+ERROR: relation "only_parted1" does not exist
+LINE 1: INSERT INTO only_parted1 VALUES (1, 'a');
+ ^
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is null, last_vacuum is null FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | ?column? | ?column?
+-------------+----------+----------
+ only_parted | t | t
+(1 row)
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is null, last_vacuum is null FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | ?column? | ?column?
+-------------+----------+----------
+ only_parted | t | t
+(1 row)
+
+VACUUM ONLY vacparted; -- gives warning
+WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is null, last_vacuum is null FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | ?column? | ?column?
+-----------------+----------+----------
+ only_inh_child | t | t
+ only_inh_parent | f | t
+(2 rows)
+
+-- Parent and child is ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is null, last_vacuum is null FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | ?column? | ?column?
+-----------------+----------+----------
+ only_inh_child | f | t
+ only_inh_parent | f | t
+(2 rows)
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ERROR: relation "does_not_exist" does not exist
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 548cd7acca..510991377a 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,6 +233,46 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF vacparted FOR VALUES IN (1);
+INSERT INTO only_parted1 VALUES (1, 'a');
+
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is null, last_vacuum is null FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is null, last_vacuum is null FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+VACUUM ONLY vacparted; -- gives warning
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is null, last_vacuum is null FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child is ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is null, last_vacuum is null FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ANALYZE (nonexistent-arg) does_not_exist;
--
2.31.1
Hi Michael,
Thanks for the patch.
I quickly tried running some ANALYZE ONLY queries, it seems like it works
fine.
-ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] )
] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] )
] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable>
[, ...] ]
It seems like extended_relation_expr allows "tablename *" syntax too. That
should be added in docs as well. (Same for VACUUM doc)
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics
by
sampling rows from all partitions; in addition, it will recurse into
each
partition and update its statistics. Each leaf partition is analyzed
only
once, even with multi-level partitioning. No statistics are collected
for
only the parent table (without data from its partitions), because with
partitioning it's guaranteed to be empty.
</para>
We may also want to update the above note in ANALYZE doc.
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a); +CREATE TABLE only_parted1 PARTITION OF vacparted FOR VALUES IN (1);
+INSERT INTO only_parted1 VALUES (1, 'a');
Tests don't seem right to me.
I believe the above should be " PARTITION OF only_parted " instead of
vacparted.
It may better to insert into partitioned table (only_parted) instead of the
partition (only_parted1);
Also it may be a good idea to test VACUUM ONLY for inheritance tables the
same way you test ANALYZE ONLY.
Lastly, the patch includes an unrelated file (compile_flags.txt) and has
whitespace errors when I apply it.
Regards,
--
Melih Mutlu
Microsoft
Thanks for the feedback Melih,
On Thu, 22 Aug 2024 at 21:26, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
It seems like extended_relation_expr allows "tablename *" syntax too. That should be added in docs as well. (Same for VACUUM doc)
I had included it in the parameter description but had missed it from
the synopsis. I've fixed that now.
We may also want to update the above note in ANALYZE doc.
Well spotted. I have updated that now.
Tests don't seem right to me.
I believe the above should be " PARTITION OF only_parted " instead of vacparted.
It may better to insert into partitioned table (only_parted) instead of the partition (only_parted1);Also it may be a good idea to test VACUUM ONLY for inheritance tables the same way you test ANALYZE ONLY.
Well spotted again. I have fixed the incorrect table names and added
tests as suggested.
Lastly, the patch includes an unrelated file (compile_flags.txt) and has whitespace errors when I apply it.
Oops! Fixed,
V2 of the patch is attached.
Cheers
Mike
Attachments:
v2-0001-Initial-implementation-of-the-ONLY-feature.patchapplication/octet-stream; name=v2-0001-Initial-implementation-of-the-ONLY-feature.patchDownload
From 837a792664c9430f05510593b2171a7a22ff82f4 Mon Sep 17 00:00:00 2001
From: Michael Harris <harmic@gmail.com>
Date: Fri, 23 Aug 2024 19:58:17 +1000
Subject: [PATCH v2] Initial implementation of the 'ONLY' feature for ANALYZE
and VACUUM.
This patch also changes the behaivour of ANALYZE and
VACUUM when given on parent tables in an inheritance
structure, to align with the case of partitioned tables.
In the absence of ONLY, descendant tables will be
included.
Refer to the thread at:
https://www.postgresql.org/message-id/flat/CADofcAWATx_haD%3DQkSxHbnTsAe6%2Be0Aw8Eh4H8cXyogGvn_kOg%40mail.gmail.com
---
doc/src/sgml/ref/analyze.sgml | 24 +++++----
doc/src/sgml/ref/vacuum.sgml | 12 +++--
src/backend/commands/vacuum.c | 29 +++++++---
src/backend/parser/gram.y | 2 +-
src/test/regress/expected/vacuum.out | 80 ++++++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql | 52 ++++++++++++++++++
6 files changed, 176 insertions(+), 23 deletions(-)
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2b94b378e9..d3bf865ef3 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
<phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
@@ -31,7 +31,7 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -142,9 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
- inheritance statistics of the partitioned table as a whole and
- statistics of the individual partitions are updated.
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are analyzed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
@@ -295,11 +298,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions; in addition, it will recurse into each
- partition and update its statistics. Each leaf partition is analyzed only
- once, even with multi-level partitioning. No statistics are collected for
- only the parent table (without data from its partitions), because with
- partitioning it's guaranteed to be empty.
+ sampling rows from all partitions. If <literal>ONLY</literal> was not
+ specified, it will also recurse into each partition and update its
+ statistics. Each leaf partition is analyzed only once, even with
+ multi-level partitioning. No statistics are collected for only the
+ parent table (without data from its partitions), because with
+ partitioning it is guaranteed to be empty.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9857b35627..cb52c270c6 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
<refsynopsisdiv>
<synopsis>
-VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
<phrase>where <replaceable class="parameter">option</replaceable> can be one of:</phrase>
@@ -42,7 +42,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -401,8 +401,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If the specified table is a partitioned
- table, all of its leaf partitions are vacuumed.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified before
+ the table name, only that table is vacuumed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are vacuumed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..55000e7451 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -879,11 +879,12 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
}
else
{
- /* Process a specific relation, and possibly partitions thereof */
+ /* Process a specific relation, and possibly partitions and/or child tables thereof */
Oid relid;
HeapTuple tuple;
Form_pg_class classForm;
- bool include_parts;
+ bool include_children;
+ bool is_partitioned_table;
int rvr_opts;
/*
@@ -944,20 +945,32 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
+ /*
+ * Vacuuming a partitioned table with ONLY will not do anything
+ * since the partitioned table itself is empty. Issue a warning
+ * if the user requests this.
+ */
+ include_children = vrel->relation->inh;
+ is_partitioned_table = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && ! include_children)
+ ereport(WARNING,
+ (errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
+ vrel->relation->relname)));
+
- include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
ReleaseSysCache(tuple);
/*
- * If it is, make relation list entries for its partitions. Note that
+ * Unless the user has specified ONLY, make relation list entries for
+ * its partitions and/or descendant tables. Note that
* the list returned by find_all_inheritors() includes the passed-in
* OID, so we have to skip that. There's no point in taking locks on
- * the individual partitions yet, and doing so would just add
+ * the individual partitions/tables yet, and doing so would just add
* unnecessary deadlock risk. For this last reason we do not check
- * yet the ownership of the partitions, which get added to the list to
- * process. Ownership will be checked later on anyway.
+ * yet the ownership of the partitions/tables, which get added to the
+ * list to process. Ownership will be checked later on anyway.
*/
- if (include_parts)
+ if (include_children)
{
List *part_oids = find_all_inheritors(relid, NoLock, NULL);
ListCell *part_lc;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index b7d98eb9f0..7cb3a9092a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11939,7 +11939,7 @@ opt_name_list:
;
vacuum_relation:
- qualified_name opt_name_list
+ relation_expr opt_name_list
{
$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
}
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 2eba712887..7165235ece 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,6 +291,86 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | f | f
+(2 rows)
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | t | f
+(2 rows)
+
+VACUUM ONLY vacparted; -- gives warning
+WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | f | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | t
+(2 rows)
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | t
+ only_inh_parent | t | t
+(2 rows)
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ERROR: relation "does_not_exist" does not exist
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 548cd7acca..fc84df0f07 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,6 +233,58 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+VACUUM ONLY vacparted; -- gives warning
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ANALYZE (nonexistent-arg) does_not_exist;
--
2.31.1
Hi Michael,
Michael Harris <harmic@gmail.com>, 23 Ağu 2024 Cum, 13:01 tarihinde şunu
yazdı:
V2 of the patch is attached.
Thanks for updating the patch. I have a few more minor feedbacks.
-ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] )
] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]
+ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] )
] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable>
[, ...] ]
I believe moving "[ ONLY ]" to the definitions of table_and_columns below,
as you did with "[ * ]", might be better to be consistent with other places
(see [1]https://www.postgresql.org/docs/16/sql-createpublication.html)
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && ! i
nclude_children)
There are also some issues with coding conventions in some places (e.g. the
space between "!" and "include_children" abode). I think running pgindent
would resolve such issue in most places.
[1]: https://www.postgresql.org/docs/16/sql-createpublication.html
Regards,
--
Melih Mutlu
Microsoft
Hi Michael,
On 2024-08-23 19:01, Michael Harris wrote:
V2 of the patch is attached.
Thanks for the proposal and the patch.
You said this patch was a first draft version, so these may be too minor
comments, but I will share them:
-- https://www.postgresql.org/docs/devel/progress-reporting.html
Note that when ANALYZE is run on a partitioned table, all of its
partitions are also recursively analyzed.
Should we also note this is the default, i.e. not using ONLY option
behavior here?
-- https://www.postgresql.org/docs/devel/ddl-partitioning.html
If you are using manual VACUUM or ANALYZE commands, don't forget that
you need to run them on each child table individually. A command like:ANALYZE measurement;
will only process the root table.
This part also should be modified, shouldn't it?
When running ANALYZE VERBOSE ONLY on a partition table, the INFO message
is like this:
=# ANALYZE VERBOSE ONLY only_parted;
INFO: analyzing "public.only_parted" inheritance tree
I may be wrong, but 'inheritance tree' makes me feel it includes child
tables.
Removing 'inheritance tree' and just describing the table name as below
might be better:
INFO: analyzing "public.only_parted"
--
Regards,
--
Atsushi Torikoshi
NTT DATA Group Corporation
On Thu, Aug 29, 2024 at 7:31 PM Melih Mutlu <m.melihmutlu@gmail.com> wrote:
Hi Michael,
Michael Harris <harmic@gmail.com>, 23 Ağu 2024 Cum, 13:01 tarihinde şunu yazdı:
V2 of the patch is attached.
Thanks for updating the patch. I have a few more minor feedbacks.
-ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <replaceable class="parameter">table_and_columns</replaceable> [, ...] ] +ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ [ ONLY ] <replaceable class="parameter">table_and_columns</replaceable> [, ...] ]I believe moving "[ ONLY ]" to the definitions of table_and_columns below, as you did with "[ * ]", might be better to be consistent with other places (see [1])
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && ! include_children)
I think you are right.
ANALYZE [ ( option [, ...] ) ] [ [ ONLY ] table_and_columns [, ...] ]
seems not explain commands like:
ANALYZE ONLY only_parted(a), ONLY only_parted(b);
Hi Michael,
On Fri, 23 Aug 2024 at 22:01, Michael Harris <harmic@gmail.com> wrote:
V2 of the patch is attached.
(I understand this is your first PostgreSQL patch)
I just wanted to make sure you knew about the commitfest cycle we use
for the development of PostgreSQL. If you've not got one already, can
you register a community account. That'll allow you to include this
patch in the September commitfest that starts on the 1st. See
https://commitfest.postgresql.org/49/
I understand there's now some sort of cool-off period for new
community accounts, so if you have trouble adding the patch before the
CF starts, let me know and I should be able to do it for you. The
commitfest normally closes for new patches around 12:00 UTC on the 1st
of the month.
There's a bit more information in
https://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submission
David
Hello Atsushi & Melih
Thank you both for your further feedback.
On Thu, 29 Aug 2024 at 21:31, Melih Mutlu <m.melihmutlu@gmail.com> wrote:
I believe moving "[ ONLY ]" to the definitions of table_and_columns below, as you did with "[ * ]", might be better to be consistent with other places (see [1])
Agreed. I have updated this.
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && ! include_children)
There are also some issues with coding conventions in some places (e.g. the space between "!" and "include_children" abode). I think running pgindent would resolve such issue in most places.
I fixed that extra space, and then ran pgindent. It did not report any
more problems.
On Fri, 30 Aug 2024 at 16:45, torikoshia <torikoshia@oss.nttdata.com> wrote:
-- https://www.postgresql.org/docs/devel/progress-reporting.html
Note that when ANALYZE is run on a partitioned table, all of its
partitions are also recursively analyzed.Should we also note this is the default, i.e. not using ONLY option
behavior here?
-- https://www.postgresql.org/docs/devel/ddl-partitioning.html
If you are using manual VACUUM or ANALYZE commands, don't forget that
you need to run them on each child table individually. A command like:ANALYZE measurement;
will only process the root table.This part also should be modified, shouldn't it?
Agreed. I have updated both of these.
When running ANALYZE VERBOSE ONLY on a partition table, the INFO message
is like this:=# ANALYZE VERBOSE ONLY only_parted;
INFO: analyzing "public.only_parted" inheritance treeI may be wrong, but 'inheritance tree' makes me feel it includes child
tables.
Removing 'inheritance tree' and just describing the table name as below
might be better:INFO: analyzing "public.only_parted"
I'm not sure about that one. If I understand the source correctly,
that particular progress
message tells the user that the system is gathering stats from the inheritance
tree in order to update the stats of the given table, not that it is
actually updating
the stats of the descendant tables.
When analyzing an inheritance structure with the ONLY you see
something like this:
=> ANALYZE VERBOSE ONLY only_inh_parent;
INFO: analyzing "public.only_inh_parent"
INFO: "only_inh_parent": scanned 0 of 0 pages, containing 0 live rows
and 0 dead rows; 0 rows in sample, 0 estimated total rows
INFO: analyzing "public.only_inh_parent" inheritance tree
INFO: "only_inh_child": scanned 1 of 1 pages, containing 3 live rows
and 0 dead rows; 3 rows in sample, 3 estimated total rows
ANALYZE
The reason you don't see the first one for partitioned tables is that
it corresponds
to sampling the contents of the parent table itself, which in the case
of a partitioned
table is guaranteed to be empty, so it is skipped.
I agree the text could be confusing, and in fact is probably confusing
even today
without the ONLY keyword, but I'm not sure what would be the best
wording to cover both the partitioned and inherited cases.
v3 of the patch is attached, and I will submit it to the commitfest.
Attachments:
v3-0001-Implementation-of-the-ONLY-feature.patchapplication/octet-stream; name=v3-0001-Implementation-of-the-ONLY-feature.patchDownload
From 55f4109bd78f84d45912c14cfceffd105ed05485 Mon Sep 17 00:00:00 2001
From: Michael Harris <harmic@gmail.com>
Date: Sun, 1 Sep 2024 11:27:56 +1000
Subject: [PATCH v3] Implementation of the 'ONLY' feature for ANALYZE and
VACUUM.
This patch also changes the behaivour of ANALYZE and
VACUUM when given on parent tables in an inheritance
structure, to align with the case of partitioned tables.
In the absence of ONLY, descendant tables will be
included.
Refer to the thread at:
https://www.postgresql.org/message-id/flat/CADofcAWATx_haD%3DQkSxHbnTsAe6%2Be0Aw8Eh4H8cXyogGvn_kOg%40mail.gmail.com
---
doc/src/sgml/ddl.sgml | 8 +--
doc/src/sgml/monitoring.sgml | 6 +--
doc/src/sgml/ref/analyze.sgml | 22 ++++----
doc/src/sgml/ref/vacuum.sgml | 10 ++--
src/backend/commands/vacuum.c | 36 +++++++++----
src/backend/parser/gram.y | 2 +-
src/test/regress/expected/vacuum.out | 80 ++++++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql | 52 ++++++++++++++++++
8 files changed, 186 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b671858627..4aa945af02 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -4854,11 +4854,11 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<listitem>
<para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each child table individually. A command like:
+ Manual <command>VACUUM</command> or <command>ANALYZE</command> commands
+ will automatically process all descendant tables. If this is undesirable,
+ you can use the <literal>ONLY</literal> keyword. A command like:
<programlisting>
-ANALYZE measurement;
+ANALYZE ONLY measurement;
</programlisting>
will only process the root table.
</para>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 55417a6fa9..c428d673fd 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5449,9 +5449,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
<note>
<para>
- Note that when <command>ANALYZE</command> is run on a partitioned table,
- all of its partitions are also recursively analyzed.
- In that case, <command>ANALYZE</command>
+ Note that when <command>ANALYZE</command> is run on a partitioned table
+ without the <literal>ONLY</literal> keyword, all of its partitions are
+ also recursively analyzed. In that case, <command>ANALYZE</command>
progress is reported first for the parent table, whereby its inheritance
statistics are collected, followed by that for each partition.
</para>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2b94b378e9..ef0f4f0caf 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -31,7 +31,7 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -142,9 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
- inheritance statistics of the partitioned table as a whole and
- statistics of the individual partitions are updated.
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are analyzed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
@@ -295,11 +298,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions; in addition, it will recurse into each
- partition and update its statistics. Each leaf partition is analyzed only
- once, even with multi-level partitioning. No statistics are collected for
- only the parent table (without data from its partitions), because with
- partitioning it's guaranteed to be empty.
+ sampling rows from all partitions. If <literal>ONLY</literal> was not
+ specified, it will also recurse into each partition and update its
+ statistics. Each leaf partition is analyzed only once, even with
+ multi-level partitioning. No statistics are collected for only the
+ parent table (without data from its partitions), because with
+ partitioning it is guaranteed to be empty.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9857b35627..a3347bde35 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -42,7 +42,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -401,8 +401,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If the specified table is a partitioned
- table, all of its leaf partitions are vacuumed.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified before
+ the table name, only that table is vacuumed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are vacuumed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..9400bd7bf6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -879,11 +879,15 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
}
else
{
- /* Process a specific relation, and possibly partitions thereof */
+ /*
+ * Process a specific relation, and possibly partitions and/or child
+ * tables thereof
+ */
Oid relid;
HeapTuple tuple;
Form_pg_class classForm;
- bool include_parts;
+ bool include_children;
+ bool is_partitioned_table;
int rvr_opts;
/*
@@ -944,20 +948,32 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
+ /*
+ * Vacuuming a partitioned table with ONLY will not do anything since
+ * the partitioned table itself is empty. Issue a warning if the user
+ * requests this.
+ */
+ include_children = vrel->relation->inh;
+ is_partitioned_table = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && !include_children)
+ ereport(WARNING,
+ (errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
+ vrel->relation->relname)));
+
- include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
ReleaseSysCache(tuple);
/*
- * If it is, make relation list entries for its partitions. Note that
- * the list returned by find_all_inheritors() includes the passed-in
- * OID, so we have to skip that. There's no point in taking locks on
- * the individual partitions yet, and doing so would just add
+ * Unless the user has specified ONLY, make relation list entries for
+ * its partitions and/or descendant tables. Note that the list
+ * returned by find_all_inheritors() includes the passed-in OID, so we
+ * have to skip that. There's no point in taking locks on the
+ * individual partitions/tables yet, and doing so would just add
* unnecessary deadlock risk. For this last reason we do not check
- * yet the ownership of the partitions, which get added to the list to
- * process. Ownership will be checked later on anyway.
+ * yet the ownership of the partitions/tables, which get added to the
+ * list to process. Ownership will be checked later on anyway.
*/
- if (include_parts)
+ if (include_children)
{
List *part_oids = find_all_inheritors(relid, NoLock, NULL);
ListCell *part_lc;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 84cef57a70..99d4ece308 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11886,7 +11886,7 @@ opt_name_list:
;
vacuum_relation:
- qualified_name opt_name_list
+ relation_expr opt_name_list
{
$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
}
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 2eba712887..7165235ece 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,6 +291,86 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | f | f
+(2 rows)
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | t | f
+(2 rows)
+
+VACUUM ONLY vacparted; -- gives warning
+WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | f | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | t
+(2 rows)
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | t
+ only_inh_parent | t | t
+(2 rows)
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ERROR: relation "does_not_exist" does not exist
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 548cd7acca..fc84df0f07 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,6 +233,58 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+VACUUM ONLY vacparted; -- gives warning
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ANALYZE (nonexistent-arg) does_not_exist;
--
2.31.1
Thanks David
I had not read that wiki page well enough, so many thanks for alerting
me that I had to create a commitfest entry. I already had a community
account so I was able to login and create it here:
https://commitfest.postgresql.org/49/5226/
I was not sure what to put for some of the fields - for 'reviewer' should
I have put the people who have provided feedback on this thread?
Is there anything else I need to do?
Thanks again
Cheers
Mike
Show quoted text
On Fri, 30 Aug 2024 at 18:44, David Rowley <dgrowleyml@gmail.com> wrote:
Hi Michael,
On Fri, 23 Aug 2024 at 22:01, Michael Harris <harmic@gmail.com> wrote:
V2 of the patch is attached.
(I understand this is your first PostgreSQL patch)
I just wanted to make sure you knew about the commitfest cycle we use
for the development of PostgreSQL. If you've not got one already, can
you register a community account. That'll allow you to include this
patch in the September commitfest that starts on the 1st. See
https://commitfest.postgresql.org/49/I understand there's now some sort of cool-off period for new
community accounts, so if you have trouble adding the patch before the
CF starts, let me know and I should be able to do it for you. The
commitfest normally closes for new patches around 12:00 UTC on the 1st
of the month.There's a bit more information in
https://wiki.postgresql.org/wiki/Submitting_a_Patch#Patch_submissionDavid
On Sun, 1 Sept 2024 at 13:41, Michael Harris <harmic@gmail.com> wrote:
https://commitfest.postgresql.org/49/5226/
I was not sure what to put for some of the fields - for 'reviewer' should
I have put the people who have provided feedback on this thread?
I think it's fairly common that only reviewers either add themselves
or don't bother. I don't think it's common that patch authors add
reviewers. Some reviewers like their reviews to be more informal and
putting themselves as a reviewer can put other people off reviewing as
they think it's already handled by someone else. That may result in
the patch being neglected by reviewers.
Is there anything else I need to do?
I see you set the target version as 17. That should be blank or "18".
PG17 is about to go into release candidate, so it is too late for
that. It's been fairly consistent that we open for new patches in
early July and close before the 2nd week of April the following year.
We're currently 2 months into PG18 dev. We don't back-patch new
features.
Nothing else aside from continuing to address reviews, as you are already.
David
On 2024-09-01 10:31, Michael Harris wrote:
Hello Atsushi & Melih
Thank you both for your further feedback.
On Thu, 29 Aug 2024 at 21:31, Melih Mutlu <m.melihmutlu@gmail.com>
wrote:I believe moving "[ ONLY ]" to the definitions of table_and_columns
below, as you did with "[ * ]", might be better to be consistent with
other places (see [1])Agreed. I have updated this.
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && !
include_children)There are also some issues with coding conventions in some places
(e.g. the space between "!" and "include_children" abode). I think
running pgindent would resolve such issue in most places.I fixed that extra space, and then ran pgindent. It did not report any
more problems.On Fri, 30 Aug 2024 at 16:45, torikoshia <torikoshia@oss.nttdata.com>
wrote:-- https://www.postgresql.org/docs/devel/progress-reporting.html
Note that when ANALYZE is run on a partitioned table, all of its
partitions are also recursively analyzed.Should we also note this is the default, i.e. not using ONLY option
behavior here?-- https://www.postgresql.org/docs/devel/ddl-partitioning.html
If you are using manual VACUUM or ANALYZE commands, don't forget that
you need to run them on each child table individually. A command like:ANALYZE measurement;
will only process the root table.This part also should be modified, shouldn't it?
Agreed. I have updated both of these.
Thanks!
When running ANALYZE VERBOSE ONLY on a partition table, the INFO
message
is like this:=# ANALYZE VERBOSE ONLY only_parted;
INFO: analyzing "public.only_parted" inheritance treeI may be wrong, but 'inheritance tree' makes me feel it includes child
tables.
Removing 'inheritance tree' and just describing the table name as
below
might be better:INFO: analyzing "public.only_parted"
I'm not sure about that one. If I understand the source correctly,
that particular progress
message tells the user that the system is gathering stats from the
inheritance
tree in order to update the stats of the given table, not that it is
actually updating
the stats of the descendant tables.
That makes sense.
When analyzing an inheritance structure with the ONLY you see
something like this:=> ANALYZE VERBOSE ONLY only_inh_parent;
INFO: analyzing "public.only_inh_parent"
INFO: "only_inh_parent": scanned 0 of 0 pages, containing 0 live rows
and 0 dead rows; 0 rows in sample, 0 estimated total rows
INFO: analyzing "public.only_inh_parent" inheritance tree
INFO: "only_inh_child": scanned 1 of 1 pages, containing 3 live rows
and 0 dead rows; 3 rows in sample, 3 estimated total rows
ANALYZEThe reason you don't see the first one for partitioned tables is that
it corresponds
to sampling the contents of the parent table itself, which in the case
of a partitioned
table is guaranteed to be empty, so it is skipped.I agree the text could be confusing, and in fact is probably confusing
even today
without the ONLY keyword,
Yeah, it seems this isn't dependent on your proposal.
(BTW I'm also wondering if the expression “inheritance" is appropriate
when the target is a partitioned table, but this should be discussed in
another thread)
--
Regards,
--
Atsushi Torikoshi
NTT DATA Group Corporation
On Sun, 1 Sept 2024 at 13:32, Michael Harris <harmic@gmail.com> wrote:
v3 of the patch is attached, and I will submit it to the commitfest.
I think this patch is pretty good.
I only have a few things I'd point out:
1. The change to ddl.sgml, technically you're removing the caveat, but
I think the choice you've made to mention the updated behaviour is
likely a good idea still. So I agree with this change, but just wanted
to mention it as someone else might think otherwise.
2. I think there's some mixing of tense in the changes to analyze.sgml:
"If <literal>ONLY</literal> was not specified, it will also recurse
into each partition and update its statistics."
You've written "was" (past tense), but then the existing text uses
"will" (future tense). I guess if the point in time is after parse and
before work has been done, then that's correct, but I think using "is"
instead of "was" is better.
3. In vacuum.sgml;
"If <literal>ONLY</literal> is not specified, the table and all its
descendant tables or partitions (if any) are vacuumed"
Maybe "are also vacuumed" instead of "are vacuumed" is more clear?
It's certainly true for inheritance parents, but I guess you could
argue that's wrong for partitioned tables.
4. A very minor detail, but I think in vacuum.c the WARNING you've
added should use RelationGetRelationName(). We seem to be very
inconsistent with using that macro and I see it's not used just above
for the lock warning, which I imagine you copied.
Aside from those, that just leaves me with the behavioural change. I
noted Tom was ok with the change in behaviour for ANALYZE (mentioned
in [1]/messages/by-id/1919201.1724289136@sss.pgh.pa.us). Tom, wondering if you feel the same for VACUUM too? If we're
doing this, I think we'd need to be quite clear about it on the
release notes.
David
Thanks for the feedback David.
On Mon, 9 Sept 2024 at 11:27, David Rowley <dgrowleyml@gmail.com> wrote:
You've written "was" (past tense), but then the existing text uses
"will" (future tense). I guess if the point in time is after parse and
before work has been done, then that's correct, but I think using "is"
instead of "was" is better.
Maybe "are also vacuumed" instead of "are vacuumed" is more clear?
Agreed. I have updated my patch with both of these suggestions.
4. A very minor detail, but I think in vacuum.c the WARNING you've
added should use RelationGetRelationName(). We seem to be very
inconsistent with using that macro and I see it's not used just above
for the lock warning, which I imagine you copied.
As far as I can tell RelationGetRelationName is for extracting the name
from a Relation struct, but in this case we have a RangeVar so it doesn't appear
to be applicable. I could not find an equivalent access macro for RangeVar.
Thanks again.
Cheers
Mike
Attachments:
v4-0001-Implementation-of-the-ONLY-feature.patchapplication/octet-stream; name=v4-0001-Implementation-of-the-ONLY-feature.patchDownload
From 7e50c93e1fedbb547405a1f26331156f9c5121b4 Mon Sep 17 00:00:00 2001
From: Michael Harris <harmic@gmail.com>
Date: Mon, 9 Sep 2024 17:44:18 +1000
Subject: [PATCH v4] Implementation of the 'ONLY' feature for ANALYZE and
VACUUM.
This patch also changes the behaivour of ANALYZE and
VACUUM when given on parent tables in an inheritance
structure, to align with the case of partitioned tables.
In the absence of ONLY, descendant tables will be
included.
Refer to the thread at:
https://www.postgresql.org/message-id/flat/CADofcAWATx_haD%3DQkSxHbnTsAe6%2Be0Aw8Eh4H8cXyogGvn_kOg%40mail.gmail.com
---
doc/src/sgml/ddl.sgml | 8 +--
doc/src/sgml/monitoring.sgml | 6 +--
doc/src/sgml/ref/analyze.sgml | 22 ++++----
doc/src/sgml/ref/vacuum.sgml | 10 ++--
src/backend/commands/vacuum.c | 36 +++++++++----
src/backend/parser/gram.y | 2 +-
src/test/regress/expected/vacuum.out | 80 ++++++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql | 52 ++++++++++++++++++
8 files changed, 186 insertions(+), 30 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b671858627..4aa945af02 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -4854,11 +4854,11 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<listitem>
<para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each child table individually. A command like:
+ Manual <command>VACUUM</command> or <command>ANALYZE</command> commands
+ will automatically process all descendant tables. If this is undesirable,
+ you can use the <literal>ONLY</literal> keyword. A command like:
<programlisting>
-ANALYZE measurement;
+ANALYZE ONLY measurement;
</programlisting>
will only process the root table.
</para>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 933de6fe07..a0545e3da5 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5522,9 +5522,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
<note>
<para>
- Note that when <command>ANALYZE</command> is run on a partitioned table,
- all of its partitions are also recursively analyzed.
- In that case, <command>ANALYZE</command>
+ Note that when <command>ANALYZE</command> is run on a partitioned table
+ without the <literal>ONLY</literal> keyword, all of its partitions are
+ also recursively analyzed. In that case, <command>ANALYZE</command>
progress is reported first for the parent table, whereby its inheritance
statistics are collected, followed by that for each partition.
</para>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2b94b378e9..dfd8779be7 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -31,7 +31,7 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -142,9 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
- inheritance statistics of the partitioned table as a whole and
- statistics of the individual partitions are updated.
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are analyzed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
@@ -295,11 +298,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions; in addition, it will recurse into each
- partition and update its statistics. Each leaf partition is analyzed only
- once, even with multi-level partitioning. No statistics are collected for
- only the parent table (without data from its partitions), because with
- partitioning it's guaranteed to be empty.
+ sampling rows from all partitions. If <literal>ONLY</literal> is not
+ specified, it will also recurse into each partition and update its
+ statistics. Each leaf partition is analyzed only once, even with
+ multi-level partitioning. No statistics are collected for only the
+ parent table (without data from its partitions), because with
+ partitioning it is guaranteed to be empty.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9857b35627..465f9fecf7 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -42,7 +42,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -401,8 +401,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If the specified table is a partitioned
- table, all of its leaf partitions are vacuumed.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified before
+ the table name, only that table is vacuumed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are also vacuumed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..9400bd7bf6 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -879,11 +879,15 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
}
else
{
- /* Process a specific relation, and possibly partitions thereof */
+ /*
+ * Process a specific relation, and possibly partitions and/or child
+ * tables thereof
+ */
Oid relid;
HeapTuple tuple;
Form_pg_class classForm;
- bool include_parts;
+ bool include_children;
+ bool is_partitioned_table;
int rvr_opts;
/*
@@ -944,20 +948,32 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
+ /*
+ * Vacuuming a partitioned table with ONLY will not do anything since
+ * the partitioned table itself is empty. Issue a warning if the user
+ * requests this.
+ */
+ include_children = vrel->relation->inh;
+ is_partitioned_table = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && !include_children)
+ ereport(WARNING,
+ (errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
+ vrel->relation->relname)));
+
- include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
ReleaseSysCache(tuple);
/*
- * If it is, make relation list entries for its partitions. Note that
- * the list returned by find_all_inheritors() includes the passed-in
- * OID, so we have to skip that. There's no point in taking locks on
- * the individual partitions yet, and doing so would just add
+ * Unless the user has specified ONLY, make relation list entries for
+ * its partitions and/or descendant tables. Note that the list
+ * returned by find_all_inheritors() includes the passed-in OID, so we
+ * have to skip that. There's no point in taking locks on the
+ * individual partitions/tables yet, and doing so would just add
* unnecessary deadlock risk. For this last reason we do not check
- * yet the ownership of the partitions, which get added to the list to
- * process. Ownership will be checked later on anyway.
+ * yet the ownership of the partitions/tables, which get added to the
+ * list to process. Ownership will be checked later on anyway.
*/
- if (include_parts)
+ if (include_children)
{
List *part_oids = find_all_inheritors(relid, NoLock, NULL);
ListCell *part_lc;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 84cef57a70..99d4ece308 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11886,7 +11886,7 @@ opt_name_list:
;
vacuum_relation:
- qualified_name opt_name_list
+ relation_expr opt_name_list
{
$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
}
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 2eba712887..7165235ece 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,6 +291,86 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | f | f
+(2 rows)
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | t | f
+(2 rows)
+
+VACUUM ONLY vacparted; -- gives warning
+WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | f | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | t
+(2 rows)
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | t
+ only_inh_parent | t | t
+(2 rows)
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ERROR: relation "does_not_exist" does not exist
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 548cd7acca..fc84df0f07 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,6 +233,58 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+VACUUM ONLY vacparted; -- gives warning
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ANALYZE (nonexistent-arg) does_not_exist;
--
2.31.1
On 2024-09-09 16:56, Michael Harris wrote:
Thanks for updating the patch.
Here is a minor comment.
@@ -944,20 +948,32 @@ expand_vacuum_rel(VacuumRelation *vrel,
MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
..
+ * Unless the user has specified ONLY, make relation list entries for + * its partitions and/or descendant tables.
Regarding the addition of partition descendant tables, should we also
update the below comment on expand_vacuum_rel? Currently it refers only
partitions:
| * Given a VacuumRelation, fill in the table OID if it wasn't
specified,
| * and optionally add VacuumRelations for partitions of the table.
Other than this and the following, it looks good to me.
On Mon, Sep 9, 2024 at 10:27 AM David Rowley <dgrowleyml@gmail.com>
wrote:
Aside from those, that just leaves me with the behavioural change. I
noted Tom was ok with the change in behaviour for ANALYZE (mentioned
in [1]). Tom, wondering if you feel the same for VACUUM too? If we're
doing this, I think we'd need to be quite clear about it on the
release notes.
--
Regards,
--
Atsushi Torikoshi
NTT DATA Group Corporation
Thanks for the feedback.
On Tue, 10 Sept 2024 at 22:03, torikoshia <torikoshia@oss.nttdata.com> wrote:
Regarding the addition of partition descendant tables, should we also
update the below comment on expand_vacuum_rel? Currently it refers only
partitions:| * Given a VacuumRelation, fill in the table OID if it wasn't
specified,
| * and optionally add VacuumRelations for partitions of the table.
Well spotted! I have updated the comment to read:
* Given a VacuumRelation, fill in the table OID if it wasn't specified,
* and optionally add VacuumRelations for partitions or descendant tables
* of the table.
Cheers
Mike.
Attachments:
v5-0001-Implementation-of-the-ONLY-feature.patchapplication/octet-stream; name=v5-0001-Implementation-of-the-ONLY-feature.patchDownload
From cc829794cb5cc0bdc331d30a6f2721444d8365ba Mon Sep 17 00:00:00 2001
From: Michael Harris <harmic@gmail.com>
Date: Wed, 11 Sep 2024 17:44:00 +1000
Subject: [PATCH v5] Implementation of the 'ONLY' feature for ANALYZE and
VACUUM.
This patch also changes the behaivour of ANALYZE and
VACUUM when given on parent tables in an inheritance
structure, to align with the case of partitioned tables.
In the absence of ONLY, descendant tables will be
included.
Refer to the thread at:
https://www.postgresql.org/message-id/flat/CADofcAWATx_haD%3DQkSxHbnTsAe6%2Be0Aw8Eh4H8cXyogGvn_kOg%40mail.gmail.com
---
doc/src/sgml/ddl.sgml | 8 +--
doc/src/sgml/monitoring.sgml | 6 +--
doc/src/sgml/ref/analyze.sgml | 22 ++++----
doc/src/sgml/ref/vacuum.sgml | 10 ++--
src/backend/commands/vacuum.c | 39 ++++++++++----
src/backend/parser/gram.y | 2 +-
src/test/regress/expected/vacuum.out | 80 ++++++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql | 52 ++++++++++++++++++
8 files changed, 188 insertions(+), 31 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b671858627..4aa945af02 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -4854,11 +4854,11 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<listitem>
<para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each child table individually. A command like:
+ Manual <command>VACUUM</command> or <command>ANALYZE</command> commands
+ will automatically process all descendant tables. If this is undesirable,
+ you can use the <literal>ONLY</literal> keyword. A command like:
<programlisting>
-ANALYZE measurement;
+ANALYZE ONLY measurement;
</programlisting>
will only process the root table.
</para>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 933de6fe07..a0545e3da5 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5522,9 +5522,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
<note>
<para>
- Note that when <command>ANALYZE</command> is run on a partitioned table,
- all of its partitions are also recursively analyzed.
- In that case, <command>ANALYZE</command>
+ Note that when <command>ANALYZE</command> is run on a partitioned table
+ without the <literal>ONLY</literal> keyword, all of its partitions are
+ also recursively analyzed. In that case, <command>ANALYZE</command>
progress is reported first for the parent table, whereby its inheritance
statistics are collected, followed by that for each partition.
</para>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2b94b378e9..dfd8779be7 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -31,7 +31,7 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -142,9 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
- inheritance statistics of the partitioned table as a whole and
- statistics of the individual partitions are updated.
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are analyzed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
@@ -295,11 +298,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions; in addition, it will recurse into each
- partition and update its statistics. Each leaf partition is analyzed only
- once, even with multi-level partitioning. No statistics are collected for
- only the parent table (without data from its partitions), because with
- partitioning it's guaranteed to be empty.
+ sampling rows from all partitions. If <literal>ONLY</literal> is not
+ specified, it will also recurse into each partition and update its
+ statistics. Each leaf partition is analyzed only once, even with
+ multi-level partitioning. No statistics are collected for only the
+ parent table (without data from its partitions), because with
+ partitioning it is guaranteed to be empty.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9857b35627..465f9fecf7 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -42,7 +42,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -401,8 +401,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If the specified table is a partitioned
- table, all of its leaf partitions are vacuumed.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified before
+ the table name, only that table is vacuumed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are also vacuumed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..951e00acee 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -851,7 +851,8 @@ vacuum_open_relation(Oid relid, RangeVar *relation, bits32 options,
/*
* Given a VacuumRelation, fill in the table OID if it wasn't specified,
- * and optionally add VacuumRelations for partitions of the table.
+ * and optionally add VacuumRelations for partitions or descendant tables
+ * of the table.
*
* If a VacuumRelation does not have an OID supplied and is a partitioned
* table, an extra entry will be added to the output for each partition.
@@ -879,11 +880,15 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
}
else
{
- /* Process a specific relation, and possibly partitions thereof */
+ /*
+ * Process a specific relation, and possibly partitions and/or child
+ * tables thereof
+ */
Oid relid;
HeapTuple tuple;
Form_pg_class classForm;
- bool include_parts;
+ bool include_children;
+ bool is_partitioned_table;
int rvr_opts;
/*
@@ -944,20 +949,32 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
+ /*
+ * Vacuuming a partitioned table with ONLY will not do anything since
+ * the partitioned table itself is empty. Issue a warning if the user
+ * requests this.
+ */
+ include_children = vrel->relation->inh;
+ is_partitioned_table = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && !include_children)
+ ereport(WARNING,
+ (errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
+ vrel->relation->relname)));
+
- include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
ReleaseSysCache(tuple);
/*
- * If it is, make relation list entries for its partitions. Note that
- * the list returned by find_all_inheritors() includes the passed-in
- * OID, so we have to skip that. There's no point in taking locks on
- * the individual partitions yet, and doing so would just add
+ * Unless the user has specified ONLY, make relation list entries for
+ * its partitions and/or descendant tables. Note that the list
+ * returned by find_all_inheritors() includes the passed-in OID, so we
+ * have to skip that. There's no point in taking locks on the
+ * individual partitions/tables yet, and doing so would just add
* unnecessary deadlock risk. For this last reason we do not check
- * yet the ownership of the partitions, which get added to the list to
- * process. Ownership will be checked later on anyway.
+ * yet the ownership of the partitions/tables, which get added to the
+ * list to process. Ownership will be checked later on anyway.
*/
- if (include_parts)
+ if (include_children)
{
List *part_oids = find_all_inheritors(relid, NoLock, NULL);
ListCell *part_lc;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 84cef57a70..99d4ece308 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11886,7 +11886,7 @@ opt_name_list:
;
vacuum_relation:
- qualified_name opt_name_list
+ relation_expr opt_name_list
{
$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
}
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 2eba712887..7165235ece 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,6 +291,86 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | f | f
+(2 rows)
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | t | f
+(2 rows)
+
+VACUUM ONLY vacparted; -- gives warning
+WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | f | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | t
+(2 rows)
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | t
+ only_inh_parent | t | t
+(2 rows)
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ERROR: relation "does_not_exist" does not exist
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 548cd7acca..fc84df0f07 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,6 +233,58 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+VACUUM ONLY vacparted; -- gives warning
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ANALYZE (nonexistent-arg) does_not_exist;
--
2.31.1
On 2024-09-11 16:47, Michael Harris wrote:
Thanks for the feedback.
On Tue, 10 Sept 2024 at 22:03, torikoshia <torikoshia@oss.nttdata.com>
wrote:Regarding the addition of partition descendant tables, should we also
update the below comment on expand_vacuum_rel? Currently it refers
only
partitions:| * Given a VacuumRelation, fill in the table OID if it wasn't
specified,
| * and optionally add VacuumRelations for partitions of the table.Well spotted! I have updated the comment to read:
* Given a VacuumRelation, fill in the table OID if it wasn't
specified,
* and optionally add VacuumRelations for partitions or descendant
tables
* of the table.
Thanks for the update!
I've switched the status on the commitfest to 'ready for committer'.
--
Regards,
--
Atsushi Torikoshi
NTT DATA Group Corporation
hi.
in https://www.postgresql.org/docs/current/ddl-inherit.html
<<<
Commands that do database maintenance and tuning (e.g., REINDEX,
VACUUM) typically only work on individual, physical tables and do not
support recursing over inheritance hierarchies. The respective
behavior of each individual command is documented in its reference
page (SQL Commands).
<<<
does the above para need some tweaks?
in section, your patch output is
<<<<<<<<<
and table_and_columns is:
[ ONLY ] table_name [ * ] [ ( column_name [, ...] ) ]
<<<<<<<<<
ANALYZE ONLY only_parted*;
will fail.
Maybe we need a sentence saying ONLY and * are mutually exclusive?
If the table being analyzed has inheritance children, ANALYZE gathers
two sets of statistics: one on the rows of the parent table only, and
a second including rows of both the parent table and all of its
children. This second set of statistics is needed when planning
queries that process the inheritance tree as a whole. The child tables
themselves are not individually analyzed in this case.
is this still true for table inheritance.
for example:
drop table if exists only_inh_parent,only_inh_child;
CREATE TABLE only_inh_parent (a int , b INT) with (autovacuum_enabled = false);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent) with
(autovacuum_enabled = false);
INSERT INTO only_inh_child(a,b) select g % 80, (g + 1) % 200 from
generate_series(1,1000) g;
select pg_table_size('only_inh_parent');
ANALYZE ONLY only_inh_parent;
here, will "ANALYZE ONLY only_inh_parent;" will have minimal "effects".
since the physical table only_inh_parent has zero storage.
but
select stadistinct,starelid::regclass,staattnum, stainherit
from pg_statistic
where starelid = ANY ('{only_inh_parent, only_inh_child}'::regclass[]);
output is
stadistinct | starelid | staattnum | stainherit
-------------+-----------------+-----------+------------
80 | only_inh_parent | 1 | t
-0.2 | only_inh_parent | 2 | t
I am not sure if this output and the manual description about table
inheritance is consistent.
Thanks for the feedback, and sorry it has taken a few days to respond.
On Mon, 16 Sept 2024 at 12:29, jian he <jian.universality@gmail.com> wrote:
in https://www.postgresql.org/docs/current/ddl-inherit.html
<<<
Commands that do database maintenance and tuning (e.g., REINDEX,
VACUUM) typically only work on individual, physical tables and do not
support recursing over inheritance hierarchies. The respective
behavior of each individual command is documented in its reference
page (SQL Commands).
<<<
does the above para need some tweaks?
Yes, good catch.
ANALYZE ONLY only_parted*;
will fail.
Maybe we need a sentence saying ONLY and * are mutually exclusive?
I used the same text that appears on other pages that are describing the
operation of ONLY / *, eg. the page for TRUNCATE
(https://www.postgresql.org/docs/current/sql-truncate.html).
I think it would be good to keep them all consistent if possible.
If the table being analyzed has inheritance children, ANALYZE gathers
two sets of statistics: one on the rows of the parent table only, and
a second including rows of both the parent table and all of its
children. This second set of statistics is needed when planning
queries that process the inheritance tree as a whole. The child tables
themselves are not individually analyzed in this case.is this still true for table inheritance.
The way I interpret this is that when you analyze an inheritance parent,
it will sample the rows of the parent and all its children. It will
prepare two sets
of stats, one based on the samples of any rows in the parent itself,
the other based on samples of rows in both parent and child tables.
This is distinct from the activity of updating the stats on the child
tables themselves.
I am not sure if this output and the manual description about table
inheritance is consistent.
With the test you performed, the result was a set of statistics for the
parent table which included samples from the child tables
(stainherit=t) but no entries for the parent table only
(stainherit=f) as there is no data in the parent table itself.
There are no statistics for only_inh_child.
If you add some records to the parent table and re-analyze, you
do get statistics with stainherit=f:
postgres=# insert into only_inh_parent(a,b) select g % 10, 100 from
generate_series(1,1000) g;
INSERT 0 1000
postgres=# select pg_table_size('only_inh_parent');
pg_table_size
---------------
65536
(1 row)
postgres=# ANALYZE ONLY only_inh_parent;
ANALYZE
postgres=# select stadistinct,starelid::regclass,staattnum, stainherit
from pg_statistic
where starelid = ANY ('{only_inh_parent, only_inh_child}'::regclass[]);
stadistinct | starelid | staattnum | stainherit
-------------+-----------------+-----------+------------
80 | only_inh_parent | 1 | t
200 | only_inh_parent | 2 | t
10 | only_inh_parent | 1 | f
1 | only_inh_parent | 2 | f
(4 rows)
and if you perform ANALYZE without ONLY:
postgres=# ANALYZE only_inh_parent;
ANALYZE
postgres=# select stadistinct,starelid::regclass,staattnum, stainherit
from pg_statistic
where starelid = ANY ('{only_inh_parent, only_inh_child}'::regclass[]);
stadistinct | starelid | staattnum | stainherit
-------------+-----------------+-----------+------------
80 | only_inh_parent | 1 | t
10 | only_inh_parent | 1 | f
1 | only_inh_parent | 2 | f
200 | only_inh_parent | 2 | t
80 | only_inh_child | 1 | f
-0.2 | only_inh_child | 2 | f
(6 rows)
Now we also have statistics for only_inh_child.
At least for me, that is consistent with the para you
quoted, except perhaps for this sentence:
The child tables themselves are not individually analyzed in this case.
This should probably read:
If the ONLY keyword is used, the child tables themselves are not individually analyzed.
I have attached a new version of the patch with this feedback incorporated.
Thanks again!
Cheers
Mike.
Attachments:
v6-0001-Implementation-of-the-ONLY-feature.patchapplication/octet-stream; name=v6-0001-Implementation-of-the-ONLY-feature.patchDownload
From eff0406d5ca8132e8a199ced72b8a0cc3b84ea22 Mon Sep 17 00:00:00 2001
From: Michael Harris <harmic@gmail.com>
Date: Fri, 20 Sep 2024 11:17:07 +1000
Subject: [PATCH v6] Implementation of the 'ONLY' feature for ANALYZE and
VACUUM.
This patch also changes the behaivour of ANALYZE and
VACUUM when given on parent tables in an inheritance
structure, to align with the case of partitioned tables.
In the absence of ONLY, descendant tables will be
included.
Refer to the thread at:
https://www.postgresql.org/message-id/flat/CADofcAWATx_haD%3DQkSxHbnTsAe6%2Be0Aw8Eh4H8cXyogGvn_kOg%40mail.gmail.com
---
doc/src/sgml/ddl.sgml | 22 ++++----
doc/src/sgml/monitoring.sgml | 6 +--
doc/src/sgml/ref/analyze.sgml | 36 +++++++------
doc/src/sgml/ref/vacuum.sgml | 10 ++--
src/backend/commands/vacuum.c | 39 ++++++++++----
src/backend/parser/gram.y | 2 +-
src/test/regress/expected/vacuum.out | 80 ++++++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql | 52 ++++++++++++++++++
8 files changed, 203 insertions(+), 44 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b671858627..c7de29ae71 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3778,12 +3778,14 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
- Commands that do database maintenance and tuning
- (e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
- typically only work on individual, physical tables and do not
- support recursing over inheritance hierarchies. The respective
- behavior of each individual command is documented in its reference
- page (<xref linkend="sql-commands"/>).
+ Some commands that do database maintenance and tuning
+ (e.g., <literal>REINDEX</literal>) only work on individual, physical
+ tables and do not support recursing over inheritance hierarchies,
+ while other maintenance commands (e.g., <literal>VACUUM</literal>,
+ <literal>ANALYZE</literal>) default to including child tables
+ and support the <literal>ONLY</literal> notation to exclude them.
+ The respective behavior of each individual command is documented
+ in its reference page (<xref linkend="sql-commands"/>).
</para>
<para>
@@ -4854,11 +4856,11 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<listitem>
<para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each child table individually. A command like:
+ Manual <command>VACUUM</command> or <command>ANALYZE</command> commands
+ will automatically process all descendant tables. If this is undesirable,
+ you can use the <literal>ONLY</literal> keyword. A command like:
<programlisting>
-ANALYZE measurement;
+ANALYZE ONLY measurement;
</programlisting>
will only process the root table.
</para>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a2fda4677d..82d6aef858 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5525,9 +5525,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
<note>
<para>
- Note that when <command>ANALYZE</command> is run on a partitioned table,
- all of its partitions are also recursively analyzed.
- In that case, <command>ANALYZE</command>
+ Note that when <command>ANALYZE</command> is run on a partitioned table
+ without the <literal>ONLY</literal> keyword, all of its partitions are
+ also recursively analyzed. In that case, <command>ANALYZE</command>
progress is reported first for the parent table, whereby its inheritance
statistics are collected, followed by that for each partition.
</para>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2b94b378e9..92e465ca59 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -31,7 +31,7 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -142,9 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
- inheritance statistics of the partitioned table as a whole and
- statistics of the individual partitions are updated.
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are analyzed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
@@ -284,22 +287,23 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<command>ANALYZE</command> gathers two sets of statistics: one on the rows
of the parent table only, and a second including rows of both the parent
table and all of its children. This second set of statistics is needed when
- planning queries that process the inheritance tree as a whole. The child
- tables themselves are not individually analyzed in this case.
- The autovacuum daemon, however, will only consider inserts or
- updates on the parent table itself when deciding whether to trigger an
- automatic analyze for that table. If that table is rarely inserted into
- or updated, the inheritance statistics will not be up to date unless you
- run <command>ANALYZE</command> manually.
+ planning queries that process the inheritance tree as a whole. If the
+ <literal>ONLY</literal> keyword is used, child tables themselves are not
+ individually analyzed. The autovacuum daemon, however, will only consider
+ inserts or updates on the parent table itself when deciding whether to
+ trigger an automatic analyze for that table. If that table is rarely
+ inserted into or updated, the inheritance statistics will not be up to
+ date unless you run <command>ANALYZE</command> manually.
</para>
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions; in addition, it will recurse into each
- partition and update its statistics. Each leaf partition is analyzed only
- once, even with multi-level partitioning. No statistics are collected for
- only the parent table (without data from its partitions), because with
- partitioning it's guaranteed to be empty.
+ sampling rows from all partitions. If <literal>ONLY</literal> is not
+ specified, it will also recurse into each partition and update its
+ statistics. Each leaf partition is analyzed only once, even with
+ multi-level partitioning. No statistics are collected for only the
+ parent table (without data from its partitions), because with
+ partitioning it is guaranteed to be empty.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9857b35627..465f9fecf7 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -42,7 +42,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -401,8 +401,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If the specified table is a partitioned
- table, all of its leaf partitions are vacuumed.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified before
+ the table name, only that table is vacuumed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are also vacuumed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..951e00acee 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -851,7 +851,8 @@ vacuum_open_relation(Oid relid, RangeVar *relation, bits32 options,
/*
* Given a VacuumRelation, fill in the table OID if it wasn't specified,
- * and optionally add VacuumRelations for partitions of the table.
+ * and optionally add VacuumRelations for partitions or descendant tables
+ * of the table.
*
* If a VacuumRelation does not have an OID supplied and is a partitioned
* table, an extra entry will be added to the output for each partition.
@@ -879,11 +880,15 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
}
else
{
- /* Process a specific relation, and possibly partitions thereof */
+ /*
+ * Process a specific relation, and possibly partitions and/or child
+ * tables thereof
+ */
Oid relid;
HeapTuple tuple;
Form_pg_class classForm;
- bool include_parts;
+ bool include_children;
+ bool is_partitioned_table;
int rvr_opts;
/*
@@ -944,20 +949,32 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
+ /*
+ * Vacuuming a partitioned table with ONLY will not do anything since
+ * the partitioned table itself is empty. Issue a warning if the user
+ * requests this.
+ */
+ include_children = vrel->relation->inh;
+ is_partitioned_table = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && !include_children)
+ ereport(WARNING,
+ (errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
+ vrel->relation->relname)));
+
- include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
ReleaseSysCache(tuple);
/*
- * If it is, make relation list entries for its partitions. Note that
- * the list returned by find_all_inheritors() includes the passed-in
- * OID, so we have to skip that. There's no point in taking locks on
- * the individual partitions yet, and doing so would just add
+ * Unless the user has specified ONLY, make relation list entries for
+ * its partitions and/or descendant tables. Note that the list
+ * returned by find_all_inheritors() includes the passed-in OID, so we
+ * have to skip that. There's no point in taking locks on the
+ * individual partitions/tables yet, and doing so would just add
* unnecessary deadlock risk. For this last reason we do not check
- * yet the ownership of the partitions, which get added to the list to
- * process. Ownership will be checked later on anyway.
+ * yet the ownership of the partitions/tables, which get added to the
+ * list to process. Ownership will be checked later on anyway.
*/
- if (include_parts)
+ if (include_children)
{
List *part_oids = find_all_inheritors(relid, NoLock, NULL);
ListCell *part_lc;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ab304ca989..b1d4642c59 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11921,7 +11921,7 @@ opt_name_list:
;
vacuum_relation:
- qualified_name opt_name_list
+ relation_expr opt_name_list
{
$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
}
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 2eba712887..7165235ece 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,6 +291,86 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | f | f
+(2 rows)
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | t | f
+(2 rows)
+
+VACUUM ONLY vacparted; -- gives warning
+WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | f | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | t
+(2 rows)
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | t
+ only_inh_parent | t | t
+(2 rows)
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ERROR: relation "does_not_exist" does not exist
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 548cd7acca..fc84df0f07 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,6 +233,58 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+VACUUM ONLY vacparted; -- gives warning
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ANALYZE (nonexistent-arg) does_not_exist;
--
2.31.1
On Fri, 20 Sept 2024 at 13:20, Michael Harris <harmic@gmail.com> wrote:
I have attached a new version of the patch with this feedback incorporated.
I looked over the v6 patch and I don't have any complaints. However, I
did make some minor adjustments:
* A few places said things like "and possibly partitions and/or child
tables thereof". It's not possible for a partition to have inheritance
children, so "and/" is wrong there, only "or" is possible.
* I spent a bit of time trying to massage the new text into the
documents. I just felt that sometimes the new text was a little bit
clumsy, for example:
<command>ANALYZE</command> gathers two sets of statistics: one on the rows
of the parent table only, and a second including rows of both the parent
table and all of its children. This second set of statistics is
needed when
- planning queries that process the inheritance tree as a whole. The child
- tables themselves are not individually analyzed in this case.
- The autovacuum daemon, however, will only consider inserts or
- updates on the parent table itself when deciding whether to trigger an
- automatic analyze for that table. If that table is rarely inserted into
- or updated, the inheritance statistics will not be up to date unless you
- run <command>ANALYZE</command> manually.
+ planning queries that process the inheritance tree as a whole. If the
+ <literal>ONLY</literal> keyword is used, child tables themselves are not
+ individually analyzed. The autovacuum daemon, however, will only consider
+ inserts or updates on the parent table itself when deciding whether to
+ trigger an automatic analyze for that table. If that table is rarely
+ inserted into or updated, the inheritance statistics will not be up to
+ date unless you run <command>ANALYZE</command> manually.
I kinda blame the existing text which reads "The child tables
themselves are not individually analyzed in this case." as that seems
to have led you to detail the new behaviour in the same place, but I
think that we really need to finish talking about autovacuum maybe not
analyzing the inheritance parent unless it receives enough changes
itself before we talk about what ONLY does.
* Very minor adjustments to the tests to upper case all the keywords
"is not null as" was all lowercase. I wrapped some longer lines too.
Also, I made some comment adjustments and I dropped the partitioned
table directly after we're done with it instead of waiting until after
the inheritance parent tests.
* A bunch of uses of "descendant tables" when you meant "inheritance children"
v7-0001 is the same as your v6 patch, but I adjusted the commit
message, which I'm happy to take feedback on. Nobody I've spoken to
seems to be concerned about VACUUM on inheritance parents now being
recursive by default. I included "release notes" and "incompatibility"
in the commit message in the hope that Bruce will stumble upon it and
write about this when he does the release notes for v18.
v7-0002 is all my changes.
I'd like to push this soon, so if anyone has any last-minute feedback,
please let me know in the next few days.
David
Attachments:
v7-0001-Add-ONLY-support-for-VACUUM-and-ANALYZE.patchapplication/octet-stream; name=v7-0001-Add-ONLY-support-for-VACUUM-and-ANALYZE.patchDownload
From debf2bbe6007f0e913637a0a736819e58369c884 Mon Sep 17 00:00:00 2001
From: Michael Harris <harmic@gmail.com>
Date: Fri, 20 Sep 2024 11:17:07 +1000
Subject: [PATCH v7 1/2] Add ONLY support for VACUUM and ANALYZE
Since autovacuum does not trigger an ANALYZE for partitioned tables,
users must perform these manually. However, performing a manual ANALYZE
on a partitioned table would always result in recursively analyzing each
partition and that could be undesirable as autovacuum takes care of that.
For partitioned tables that contain a large number of partitions, having
to analyze each partition could take an unreasonably long time, especially
so for tables with a large number of columns.
Here we allow the ONLY keyword to prefix the name of the table to allow
users to have ANALYZE skip processing partitions. This option can also
be used with VACUUM, but there is no work to do if VACUUM ONLY is used on
a partitioned table.
This commit also changes the behavior of VACUUM and ANALYZE for
inheritance parents. Previously inheritance child tables would not be
processed when operating on the parent. Now, by default we *do* operate
on the child tables. ONLY can be used to obtain the old behavior.
The release notes should note this as an incompatibility. The default
behavior has not changed for partitioned tables as these always
recursively processed the partitions.
Author: Michael Harris <harmic@gmail.com>
Discussion: https://postgr.es/m/CADofcAWATx_haD=QkSxHbnTsAe6+e0Aw8Eh4H8cXyogGvn_kOg@mail.gmail.com
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
---
doc/src/sgml/ddl.sgml | 22 ++++----
doc/src/sgml/monitoring.sgml | 6 +--
doc/src/sgml/ref/analyze.sgml | 36 +++++++------
doc/src/sgml/ref/vacuum.sgml | 10 ++--
src/backend/commands/vacuum.c | 39 ++++++++++----
src/backend/parser/gram.y | 2 +-
src/test/regress/expected/vacuum.out | 80 ++++++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql | 52 ++++++++++++++++++
8 files changed, 203 insertions(+), 44 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b671858627..c7de29ae71 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3778,12 +3778,14 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
- Commands that do database maintenance and tuning
- (e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
- typically only work on individual, physical tables and do not
- support recursing over inheritance hierarchies. The respective
- behavior of each individual command is documented in its reference
- page (<xref linkend="sql-commands"/>).
+ Some commands that do database maintenance and tuning
+ (e.g., <literal>REINDEX</literal>) only work on individual, physical
+ tables and do not support recursing over inheritance hierarchies,
+ while other maintenance commands (e.g., <literal>VACUUM</literal>,
+ <literal>ANALYZE</literal>) default to including child tables
+ and support the <literal>ONLY</literal> notation to exclude them.
+ The respective behavior of each individual command is documented
+ in its reference page (<xref linkend="sql-commands"/>).
</para>
<para>
@@ -4854,11 +4856,11 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<listitem>
<para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each child table individually. A command like:
+ Manual <command>VACUUM</command> or <command>ANALYZE</command> commands
+ will automatically process all descendant tables. If this is undesirable,
+ you can use the <literal>ONLY</literal> keyword. A command like:
<programlisting>
-ANALYZE measurement;
+ANALYZE ONLY measurement;
</programlisting>
will only process the root table.
</para>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a2fda4677d..82d6aef858 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5525,9 +5525,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
<note>
<para>
- Note that when <command>ANALYZE</command> is run on a partitioned table,
- all of its partitions are also recursively analyzed.
- In that case, <command>ANALYZE</command>
+ Note that when <command>ANALYZE</command> is run on a partitioned table
+ without the <literal>ONLY</literal> keyword, all of its partitions are
+ also recursively analyzed. In that case, <command>ANALYZE</command>
progress is reported first for the parent table, whereby its inheritance
statistics are collected, followed by that for each partition.
</para>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2b94b378e9..92e465ca59 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -31,7 +31,7 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -142,9 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
- inheritance statistics of the partitioned table as a whole and
- statistics of the individual partitions are updated.
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are analyzed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
@@ -284,22 +287,23 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<command>ANALYZE</command> gathers two sets of statistics: one on the rows
of the parent table only, and a second including rows of both the parent
table and all of its children. This second set of statistics is needed when
- planning queries that process the inheritance tree as a whole. The child
- tables themselves are not individually analyzed in this case.
- The autovacuum daemon, however, will only consider inserts or
- updates on the parent table itself when deciding whether to trigger an
- automatic analyze for that table. If that table is rarely inserted into
- or updated, the inheritance statistics will not be up to date unless you
- run <command>ANALYZE</command> manually.
+ planning queries that process the inheritance tree as a whole. If the
+ <literal>ONLY</literal> keyword is used, child tables themselves are not
+ individually analyzed. The autovacuum daemon, however, will only consider
+ inserts or updates on the parent table itself when deciding whether to
+ trigger an automatic analyze for that table. If that table is rarely
+ inserted into or updated, the inheritance statistics will not be up to
+ date unless you run <command>ANALYZE</command> manually.
</para>
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions; in addition, it will recurse into each
- partition and update its statistics. Each leaf partition is analyzed only
- once, even with multi-level partitioning. No statistics are collected for
- only the parent table (without data from its partitions), because with
- partitioning it's guaranteed to be empty.
+ sampling rows from all partitions. If <literal>ONLY</literal> is not
+ specified, it will also recurse into each partition and update its
+ statistics. Each leaf partition is analyzed only once, even with
+ multi-level partitioning. No statistics are collected for only the
+ parent table (without data from its partitions), because with
+ partitioning it is guaranteed to be empty.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9857b35627..465f9fecf7 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -42,7 +42,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -401,8 +401,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If the specified table is a partitioned
- table, all of its leaf partitions are vacuumed.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified before
+ the table name, only that table is vacuumed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are also vacuumed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..951e00acee 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -851,7 +851,8 @@ vacuum_open_relation(Oid relid, RangeVar *relation, bits32 options,
/*
* Given a VacuumRelation, fill in the table OID if it wasn't specified,
- * and optionally add VacuumRelations for partitions of the table.
+ * and optionally add VacuumRelations for partitions or descendant tables
+ * of the table.
*
* If a VacuumRelation does not have an OID supplied and is a partitioned
* table, an extra entry will be added to the output for each partition.
@@ -879,11 +880,15 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
}
else
{
- /* Process a specific relation, and possibly partitions thereof */
+ /*
+ * Process a specific relation, and possibly partitions and/or child
+ * tables thereof
+ */
Oid relid;
HeapTuple tuple;
Form_pg_class classForm;
- bool include_parts;
+ bool include_children;
+ bool is_partitioned_table;
int rvr_opts;
/*
@@ -944,20 +949,32 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
+ /*
+ * Vacuuming a partitioned table with ONLY will not do anything since
+ * the partitioned table itself is empty. Issue a warning if the user
+ * requests this.
+ */
+ include_children = vrel->relation->inh;
+ is_partitioned_table = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && !include_children)
+ ereport(WARNING,
+ (errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
+ vrel->relation->relname)));
+
- include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
ReleaseSysCache(tuple);
/*
- * If it is, make relation list entries for its partitions. Note that
- * the list returned by find_all_inheritors() includes the passed-in
- * OID, so we have to skip that. There's no point in taking locks on
- * the individual partitions yet, and doing so would just add
+ * Unless the user has specified ONLY, make relation list entries for
+ * its partitions and/or descendant tables. Note that the list
+ * returned by find_all_inheritors() includes the passed-in OID, so we
+ * have to skip that. There's no point in taking locks on the
+ * individual partitions/tables yet, and doing so would just add
* unnecessary deadlock risk. For this last reason we do not check
- * yet the ownership of the partitions, which get added to the list to
- * process. Ownership will be checked later on anyway.
+ * yet the ownership of the partitions/tables, which get added to the
+ * list to process. Ownership will be checked later on anyway.
*/
- if (include_parts)
+ if (include_children)
{
List *part_oids = find_all_inheritors(relid, NoLock, NULL);
ListCell *part_lc;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ab304ca989..b1d4642c59 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11921,7 +11921,7 @@ opt_name_list:
;
vacuum_relation:
- qualified_name opt_name_list
+ relation_expr opt_name_list
{
$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
}
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 2eba712887..7165235ece 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,6 +291,86 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | f | f
+(2 rows)
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | t | f
+(2 rows)
+
+VACUUM ONLY vacparted; -- gives warning
+WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | f | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | t
+(2 rows)
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | t
+ only_inh_parent | t | t
+(2 rows)
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ERROR: relation "does_not_exist" does not exist
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 548cd7acca..fc84df0f07 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,6 +233,58 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+VACUUM ONLY vacparted; -- gives warning
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ANALYZE (nonexistent-arg) does_not_exist;
--
2.34.1
v7-0002-fixup-Implementation-of-the-ONLY-feature-for-ANAL.patchapplication/octet-stream; name=v7-0002-fixup-Implementation-of-the-ONLY-feature-for-ANAL.patchDownload
From a9ead4e7073059b2da51b461097e27b6b1a37076 Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Sun, 22 Sep 2024 23:56:39 +1200
Subject: [PATCH v7 2/2] fixup! Implementation of the 'ONLY' feature for
ANALYZE and VACUUM.
---
doc/src/sgml/ddl.sgml | 21 +++++-----
doc/src/sgml/monitoring.sgml | 2 +-
doc/src/sgml/ref/analyze.sgml | 39 +++++++++--------
doc/src/sgml/ref/vacuum.sgml | 13 +++---
src/backend/commands/vacuum.c | 12 +++---
src/test/regress/expected/vacuum.out | 60 ++++++++++++++++-----------
src/test/regress/sql/vacuum.sql | 62 +++++++++++++++++-----------
7 files changed, 119 insertions(+), 90 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index c7de29ae71..8ab0ddb112 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3778,14 +3778,14 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
- Some commands that do database maintenance and tuning
+ The majority of commands that do database maintenance and tuning
(e.g., <literal>REINDEX</literal>) only work on individual, physical
- tables and do not support recursing over inheritance hierarchies,
- while other maintenance commands (e.g., <literal>VACUUM</literal>,
- <literal>ANALYZE</literal>) default to including child tables
- and support the <literal>ONLY</literal> notation to exclude them.
- The respective behavior of each individual command is documented
- in its reference page (<xref linkend="sql-commands"/>).
+ tables and do not support recursing over inheritance hierarchies.
+ However, both <literal>VACUUM</literal> and <literal>ANALYZE</literal>
+ commands default to including child tables and the <literal>ONLY</literal>
+ notation is supported to allow them to be excluded. The respective
+ behavior of each individual command is documented in its reference page
+ (<xref linkend="sql-commands"/>).
</para>
<para>
@@ -4856,9 +4856,10 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<listitem>
<para>
- Manual <command>VACUUM</command> or <command>ANALYZE</command> commands
- will automatically process all descendant tables. If this is undesirable,
- you can use the <literal>ONLY</literal> keyword. A command like:
+ Manual <command>VACUUM</command> and <command>ANALYZE</command>
+ commands will automatically process all inheritance child tables. If
+ this is undesirable, you can use the <literal>ONLY</literal> keyword.
+ A command like:
<programlisting>
ANALYZE ONLY measurement;
</programlisting>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 82d6aef858..db7f35a451 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5527,7 +5527,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
<para>
Note that when <command>ANALYZE</command> is run on a partitioned table
without the <literal>ONLY</literal> keyword, all of its partitions are
- also recursively analyzed. In that case, <command>ANALYZE</command>
+ also recursively analyzed. In that case, <command>ANALYZE</command>
progress is reported first for the parent table, whereby its inheritance
statistics are collected, followed by that for each partition.
</para>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 92e465ca59..78a565f792 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -142,12 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If <literal>ONLY</literal> is specified before
- the table name, only that table is analyzed. If <literal>ONLY</literal>
- is not specified, the table and all its descendant tables or partitions
- (if any) are analyzed. Optionally, <literal>*</literal>
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its inheritance child tables or
+ partitions (if any) are analyzed. Optionally, <literal>*</literal>
can be specified after the table name to explicitly indicate that
- descendant tables are included.
+ inheritance child tables (or partitions) are to be analyzed.
</para>
</listitem>
</varlistentry>
@@ -287,23 +287,26 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<command>ANALYZE</command> gathers two sets of statistics: one on the rows
of the parent table only, and a second including rows of both the parent
table and all of its children. This second set of statistics is needed when
- planning queries that process the inheritance tree as a whole. If the
- <literal>ONLY</literal> keyword is used, child tables themselves are not
- individually analyzed. The autovacuum daemon, however, will only consider
- inserts or updates on the parent table itself when deciding whether to
- trigger an automatic analyze for that table. If that table is rarely
- inserted into or updated, the inheritance statistics will not be up to
- date unless you run <command>ANALYZE</command> manually.
+ planning queries that process the inheritance tree as a whole. The
+ autovacuum daemon, however, will only consider inserts or updates on the
+ parent table itself when deciding whether to trigger an automatic analyze
+ for that table. If that table is rarely inserted into or updated, the
+ inheritance statistics will not be up to date unless you run
+ <command>ANALYZE</command> manually. By default,
+ <command>ANALYZE</command> will also recursively collect statistics for
+ each inheritance child table. The <literal>ONLY</literal> keyword may be
+ used to disable this.
</para>
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions. If <literal>ONLY</literal> is not
- specified, it will also recurse into each partition and update its
- statistics. Each leaf partition is analyzed only once, even with
- multi-level partitioning. No statistics are collected for only the
- parent table (without data from its partitions), because with
- partitioning it is guaranteed to be empty.
+ sampling rows from all partitions. Each leaf partition is also
+ recursively analyzed and the statistics updated. This recursive part
+ may be disabled by using the <literal>ONLY</literal> keyword, otherwise,
+ leaf partitions are analyzed only once, even with multi-level
+ partitioning. No statistics are collected for only the parent table
+ (without data from its partitions), because with partitioning it's
+ guaranteed to be empty.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 465f9fecf7..9110938fab 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -401,12 +401,13 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If <literal>ONLY</literal> is specified before
- the table name, only that table is vacuumed. If <literal>ONLY</literal>
- is not specified, the table and all its descendant tables or partitions
- (if any) are also vacuumed. Optionally, <literal>*</literal>
- can be specified after the table name to explicitly indicate that
- descendant tables are included.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified
+ before the table name, only that table is vacuumed. If
+ <literal>ONLY</literal> is not specified, the table and all its
+ inheritance child tables or partitions (if any) are also vacuumed.
+ Optionally, <literal>*</literal> can be specified after the table name
+ to explicitly indicate that inheritance child tables (or partitions) are
+ to be vacuumed.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 951e00acee..bcadca4214 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -851,8 +851,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, bits32 options,
/*
* Given a VacuumRelation, fill in the table OID if it wasn't specified,
- * and optionally add VacuumRelations for partitions or descendant tables
- * of the table.
+ * and optionally add VacuumRelations for partitions or inheritance children.
*
* If a VacuumRelation does not have an OID supplied and is a partitioned
* table, an extra entry will be added to the output for each partition.
@@ -881,8 +880,8 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
else
{
/*
- * Process a specific relation, and possibly partitions and/or child
- * tables thereof
+ * Process a specific relation, and possibly partitions or child
+ * tables thereof.
*/
Oid relid;
HeapTuple tuple;
@@ -951,7 +950,7 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
/*
* Vacuuming a partitioned table with ONLY will not do anything since
- * the partitioned table itself is empty. Issue a warning if the user
+ * the partitioned table itself is empty. Issue a warning if the user
* requests this.
*/
include_children = vrel->relation->inh;
@@ -961,12 +960,11 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
(errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
vrel->relation->relname)));
-
ReleaseSysCache(tuple);
/*
* Unless the user has specified ONLY, make relation list entries for
- * its partitions and/or descendant tables. Note that the list
+ * its partitions or inheritance child tables. Note that the list
* returned by find_all_inheritors() includes the passed-in OID, so we
* have to skip that. There's no point in taking locks on the
* individual partitions/tables yet, and doing so would just add
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 7165235ece..709d53047c 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,14 +291,17 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
--- ANALYZE ONLY / VACUUM ONLY on partitioned table
-CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+--
+-- Tests for ANALYZE ONLY / VACUUM ONLY on partitioned tables
+--
+CREATE TABLE only_parted (a int, b text) PARTITION BY LIST (a);
CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
INSERT INTO only_parted VALUES (1, 'a');
--- Only partitioned table is analyzed
+-- Ensure only the partitioned table is analyzed
ANALYZE ONLY only_parted;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_parted%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_parted'::regclass, 'only_parted1'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
--------------+----------+----------
@@ -306,10 +309,11 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_parted1 | f | f
(2 rows)
--- Partitioned table and partitions are analyzed
+-- Ensure partitioned table and the partitions are analyzed
ANALYZE only_parted;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_parted%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_parted'::regclass, 'only_parted1'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
--------------+----------+----------
@@ -317,17 +321,23 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_parted1 | t | f
(2 rows)
-VACUUM ONLY vacparted; -- gives warning
+DROP TABLE only_parted;
+-- VACUUM ONLY on a partitioned table does nothing, ensure we get a warning.
+VACUUM ONLY vacparted;
WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
-ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
--- ANALYZE ONLY on inherited tables
+-- Try ANALYZE ONLY with a column list
+ANALYZE ONLY vacparted(a,b);
+--
+-- Tests for VACUUM ONLY / ANALYZE ONLY on inheritance tables
+--
CREATE TABLE only_inh_parent (a int primary key, b TEXT);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
--- Only parent is ANALYZED
+-- Ensure only parent is analyzed
ANALYZE ONLY only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
-----------------+----------+----------
@@ -335,10 +345,11 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_inh_parent | t | f
(2 rows)
--- Parent and child are ANALYZED
+-- Ensure the parent and child are analyzed
ANALYZE only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
-----------------+----------+----------
@@ -346,10 +357,11 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_inh_parent | t | f
(2 rows)
--- Only parent is VACUUMED
+-- Ensure only the parent is vacuumed
VACUUM ONLY only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
-----------------+----------+----------
@@ -357,10 +369,11 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_inh_parent | t | t
(2 rows)
--- Parent and child are VACUUMED
+-- Ensure parent and child are vacuumed
VACUUM only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
-----------------+----------+----------
@@ -368,7 +381,6 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_inh_parent | t | t
(2 rows)
-DROP TABLE only_parted CASCADE;
DROP TABLE only_inh_parent CASCADE;
NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index fc84df0f07..5e55079e71 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,56 +233,70 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
--- ANALYZE ONLY / VACUUM ONLY on partitioned table
-CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+--
+-- Tests for ANALYZE ONLY / VACUUM ONLY on partitioned tables
+--
+CREATE TABLE only_parted (a int, b text) PARTITION BY LIST (a);
CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
INSERT INTO only_parted VALUES (1, 'a');
--- Only partitioned table is analyzed
+-- Ensure only the partitioned table is analyzed
ANALYZE ONLY only_parted;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_parted%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_parted'::regclass, 'only_parted1'::regclass)
ORDER BY relname;
--- Partitioned table and partitions are analyzed
+-- Ensure partitioned table and the partitions are analyzed
ANALYZE only_parted;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_parted%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_parted'::regclass, 'only_parted1'::regclass)
ORDER BY relname;
-VACUUM ONLY vacparted; -- gives warning
-ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+DROP TABLE only_parted;
+
+-- VACUUM ONLY on a partitioned table does nothing, ensure we get a warning.
+VACUUM ONLY vacparted;
--- ANALYZE ONLY on inherited tables
+-- Try ANALYZE ONLY with a column list
+ANALYZE ONLY vacparted(a,b);
+
+--
+-- Tests for VACUUM ONLY / ANALYZE ONLY on inheritance tables
+--
CREATE TABLE only_inh_parent (a int primary key, b TEXT);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
--- Only parent is ANALYZED
+-- Ensure only parent is analyzed
ANALYZE ONLY only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
--- Parent and child are ANALYZED
+-- Ensure the parent and child are analyzed
ANALYZE only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
--- Only parent is VACUUMED
+-- Ensure only the parent is vacuumed
VACUUM ONLY only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
--- Parent and child are VACUUMED
+-- Ensure parent and child are vacuumed
VACUUM only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
-DROP TABLE only_parted CASCADE;
DROP TABLE only_inh_parent CASCADE;
-- parenthesized syntax for ANALYZE
--
2.34.1
On Sun, 22 Sept 2024 at 23:09, David Rowley <dgrowleyml@gmail.com> wrote:
I'd like to push this soon, so if anyone has any last-minute feedback,
please let me know in the next few days.
Many thanks for your updates and assistance.
Looks good. Agreed, I was probably too conservative in some of the
doc updates.
Thanks to all reviewers for helping to get it to this stage!
Cheers
Mike
On Sun, Sep 22, 2024 at 9:09 PM David Rowley <dgrowleyml@gmail.com> wrote:
v7-0002 is all my changes.
I'd like to push this soon, so if anyone has any last-minute feedback,
please let me know in the next few days.
drop table if exists only_inh_parent,only_inh_child;
CREATE TABLE only_inh_parent (a int , b INT) with (autovacuum_enabled = false);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent) with
(autovacuum_enabled = false);
INSERT INTO only_inh_child(a,b) select g % 80, (g + 1) % 200 from
generate_series(1,1000) g;
select pg_table_size('only_inh_parent');
ANALYZE ONLY only_inh_parent;
select stadistinct,starelid::regclass,staattnum, stainherit
from pg_statistic
where starelid = ANY ('{only_inh_parent, only_inh_child}'::regclass[]);
stadistinct | starelid | staattnum | stainherit
-------------+-----------------+-----------+------------
80 | only_inh_parent | 1 | t
-0.2 | only_inh_parent | 2 | t
---------------
catalog-pg-statistic.html
stainherit bool
If true, the stats include values from child tables, not just the
values in the specified relation
Normally there is one entry, with stainherit = false, for each table
column that has been analyzed. If the table has inheritance children
or partitions, a second entry with stainherit = true is also created.
This row represents the column's statistics over the inheritance tree,
i.e., statistics for the data you'd see with SELECT column FROM
table*, whereas the stainherit = false row represents the results of
SELECT column FROM ONLY table.
I do understand what Michael Harris said in [1]/messages/by-id/CADofcAW43AD=qqtj1cLkTyRpPM6JB5ZALUK7CA1KZZqpcouoYw@mail.gmail.com
---------------
Given the above context, I am still confused with this sentence in
sql-analyze.html.
"If ONLY is specified before the table name, only that table is analyzed."
like in the above sql example, only_inh_parent's child is also being analyzed.
[1]: /messages/by-id/CADofcAW43AD=qqtj1cLkTyRpPM6JB5ZALUK7CA1KZZqpcouoYw@mail.gmail.com
On Mon, 23 Sept 2024 at 15:29, jian he <jian.universality@gmail.com> wrote:
Given the above context, I am still confused with this sentence in
sql-analyze.html.
"If ONLY is specified before the table name, only that table is analyzed."like in the above sql example, only_inh_parent's child is also being analyzed.
I guess it depends what you define "analyzed" to mean. In this
context, it means gathering statistics specifically for a certain
table.
Would it be more clear if "only that table is analyzed." was changed
to "then statistics are only gathered specifically for that table."?
David
On Mon, Sep 23, 2024 at 12:46 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 23 Sept 2024 at 15:29, jian he <jian.universality@gmail.com> wrote:
Given the above context, I am still confused with this sentence in
sql-analyze.html.
"If ONLY is specified before the table name, only that table is analyzed."like in the above sql example, only_inh_parent's child is also being analyzed.
I guess it depends what you define "analyzed" to mean. In this
context, it means gathering statistics specifically for a certain
table.Would it be more clear if "only that table is analyzed." was changed
to "then statistics are only gathered specifically for that table."?
looking at expand_vacuum_rel, analyze_rel.
if we
---------
if (onerel->rd_rel->relhassubclass)
do_analyze_rel(onerel, params, va_cols, acquirefunc, relpages,
true, in_outer_xact, elevel);
change to
if (onerel->rd_rel->relhassubclass && ((!relation ||
relation->inh) || onerel->rd_rel->relkind ==
RELKIND_PARTITIONED_TABLE))
do_analyze_rel(onerel, params, va_cols, acquirefunc, relpages,
true, in_outer_xact, elevel);
then the inheritance table will behave the way the doc says.
for example:
drop table if exists only_inh_parent,only_inh_child;
CREATE TABLE only_inh_parent (a int , b INT) with (autovacuum_enabled = false);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent) with
(autovacuum_enabled = false);
INSERT INTO only_inh_child(a,b) select g % 80, (g + 1) % 200 from
generate_series(1,1000) g;
ANALYZE ONLY only_inh_parent;
select stadistinct,starelid::regclass,staattnum, stainherit
from pg_statistic
where starelid = ANY ('{only_inh_parent, only_inh_child}'::regclass[]);
will return zero rows, since the physical table only_inh_parent has no storage.
sql-analyze.html
For partitioned tables, ANALYZE gathers statistics by sampling rows
from all partitions. Each leaf partition is also recursively analyzed
and the statistics updated. This recursive part may be disabled by
using the ONLY keyword, otherwise, leaf partitions are analyzed only
once, even with multi-level partitioning. No statistics are collected
for only the parent table (without data from its partitions), because
with partitioning it's guaranteed to be empty.
allow me to ask anenglish language question.
here "otherwise" means specify ONLY or not?
As far as i understand.
if you specify ONLY, postgres will only do "For partitioned tables,
ANALYZE gathers statistics by sampling rows from all partitions"
if you not specify ONLY, postgres will do "For partitioned tables,
ANALYZE gathers statistics by sampling rows from all partitions *AND*
also recursively analyze each leaf partition"
Is my understanding correct?
I think there is a whitespace error in "ANALYZE ONLY vacparted(a,b); "
in vacuum.out.
On Mon, 23 Sept 2024 at 19:39, jian he <jian.universality@gmail.com> wrote:
sql-analyze.html
For partitioned tables, ANALYZE gathers statistics by sampling rows
from all partitions. Each leaf partition is also recursively analyzed
and the statistics updated. This recursive part may be disabled by
using the ONLY keyword, otherwise, leaf partitions are analyzed only
once, even with multi-level partitioning. No statistics are collected
for only the parent table (without data from its partitions), because
with partitioning it's guaranteed to be empty.allow me to ask anenglish language question.
here "otherwise" means specify ONLY or not?
As far as i understand.
if you specify ONLY, postgres will only do "For partitioned tables,
ANALYZE gathers statistics by sampling rows from all partitions"
if you not specify ONLY, postgres will do "For partitioned tables,
ANALYZE gathers statistics by sampling rows from all partitions *AND*
also recursively analyze each leaf partition"Is my understanding correct?
The "Otherwise" case applies when "ONLY" isn't used.
If this is confusing, I think there's a bunch of detail that I tried
to keep that's just not that useful. The part about analyzing
partitions just once and the part about not collecting non-inheritance
stats for the partitioned table seems like extra detail that's either
obvious or just not that important.
Can you have a look at the attached and let me know if it's easier to
understand now?
David
Attachments:
v8-0001-Add-ONLY-support-for-VACUUM-and-ANALYZE.patchapplication/octet-stream; name=v8-0001-Add-ONLY-support-for-VACUUM-and-ANALYZE.patchDownload
From 2218eda1bdafe47bc1bdfffeb9466d9f99f8d9b6 Mon Sep 17 00:00:00 2001
From: Michael Harris <harmic@gmail.com>
Date: Fri, 20 Sep 2024 11:17:07 +1000
Subject: [PATCH v8 1/2] Add ONLY support for VACUUM and ANALYZE
Since autovacuum does not trigger an ANALYZE for partitioned tables,
users must perform these manually. However, performing a manual ANALYZE
on a partitioned table would always result in recursively analyzing each
partition and that could be undesirable as autovacuum takes care of that.
For partitioned tables that contain a large number of partitions, having
to analyze each partition could take an unreasonably long time, especially
so for tables with a large number of columns.
Here we allow the ONLY keyword to prefix the name of the table to allow
users to have ANALYZE skip processing partitions. This option can also
be used with VACUUM, but there is no work to do if VACUUM ONLY is used on
a partitioned table.
This commit also changes the behavior of VACUUM and ANALYZE for
inheritance parents. Previously inheritance child tables would not be
processed when operating on the parent. Now, by default we *do* operate
on the child tables. ONLY can be used to obtain the old behavior.
The release notes should note this as an incompatibility. The default
behavior has not changed for partitioned tables as these always
recursively processed the partitions.
Author: Michael Harris <harmic@gmail.com>
Discussion: https://postgr.es/m/CADofcAWATx_haD=QkSxHbnTsAe6+e0Aw8Eh4H8cXyogGvn_kOg@mail.gmail.com
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Melih Mutlu <m.melihmutlu@gmail.com>
Reviewed-by: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Reviewed-by: jian he <jian.universality@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
---
doc/src/sgml/ddl.sgml | 22 ++++----
doc/src/sgml/monitoring.sgml | 6 +--
doc/src/sgml/ref/analyze.sgml | 36 +++++++------
doc/src/sgml/ref/vacuum.sgml | 10 ++--
src/backend/commands/vacuum.c | 39 ++++++++++----
src/backend/parser/gram.y | 2 +-
src/test/regress/expected/vacuum.out | 80 ++++++++++++++++++++++++++++
src/test/regress/sql/vacuum.sql | 52 ++++++++++++++++++
8 files changed, 203 insertions(+), 44 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index b671858627..c7de29ae71 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3778,12 +3778,14 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
- Commands that do database maintenance and tuning
- (e.g., <literal>REINDEX</literal>, <literal>VACUUM</literal>)
- typically only work on individual, physical tables and do not
- support recursing over inheritance hierarchies. The respective
- behavior of each individual command is documented in its reference
- page (<xref linkend="sql-commands"/>).
+ Some commands that do database maintenance and tuning
+ (e.g., <literal>REINDEX</literal>) only work on individual, physical
+ tables and do not support recursing over inheritance hierarchies,
+ while other maintenance commands (e.g., <literal>VACUUM</literal>,
+ <literal>ANALYZE</literal>) default to including child tables
+ and support the <literal>ONLY</literal> notation to exclude them.
+ The respective behavior of each individual command is documented
+ in its reference page (<xref linkend="sql-commands"/>).
</para>
<para>
@@ -4854,11 +4856,11 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<listitem>
<para>
- If you are using manual <command>VACUUM</command> or
- <command>ANALYZE</command> commands, don't forget that
- you need to run them on each child table individually. A command like:
+ Manual <command>VACUUM</command> or <command>ANALYZE</command> commands
+ will automatically process all descendant tables. If this is undesirable,
+ you can use the <literal>ONLY</literal> keyword. A command like:
<programlisting>
-ANALYZE measurement;
+ANALYZE ONLY measurement;
</programlisting>
will only process the root table.
</para>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a2fda4677d..82d6aef858 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5525,9 +5525,9 @@ FROM pg_stat_get_backend_idset() AS backendid;
<note>
<para>
- Note that when <command>ANALYZE</command> is run on a partitioned table,
- all of its partitions are also recursively analyzed.
- In that case, <command>ANALYZE</command>
+ Note that when <command>ANALYZE</command> is run on a partitioned table
+ without the <literal>ONLY</literal> keyword, all of its partitions are
+ also recursively analyzed. In that case, <command>ANALYZE</command>
progress is reported first for the parent table, whereby its inheritance
statistics are collected, followed by that for each partition.
</para>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 2b94b378e9..92e465ca59 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -31,7 +31,7 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -142,9 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If the specified table is a partitioned table, both the
- inheritance statistics of the partitioned table as a whole and
- statistics of the individual partitions are updated.
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are analyzed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
@@ -284,22 +287,23 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<command>ANALYZE</command> gathers two sets of statistics: one on the rows
of the parent table only, and a second including rows of both the parent
table and all of its children. This second set of statistics is needed when
- planning queries that process the inheritance tree as a whole. The child
- tables themselves are not individually analyzed in this case.
- The autovacuum daemon, however, will only consider inserts or
- updates on the parent table itself when deciding whether to trigger an
- automatic analyze for that table. If that table is rarely inserted into
- or updated, the inheritance statistics will not be up to date unless you
- run <command>ANALYZE</command> manually.
+ planning queries that process the inheritance tree as a whole. If the
+ <literal>ONLY</literal> keyword is used, child tables themselves are not
+ individually analyzed. The autovacuum daemon, however, will only consider
+ inserts or updates on the parent table itself when deciding whether to
+ trigger an automatic analyze for that table. If that table is rarely
+ inserted into or updated, the inheritance statistics will not be up to
+ date unless you run <command>ANALYZE</command> manually.
</para>
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions; in addition, it will recurse into each
- partition and update its statistics. Each leaf partition is analyzed only
- once, even with multi-level partitioning. No statistics are collected for
- only the parent table (without data from its partitions), because with
- partitioning it's guaranteed to be empty.
+ sampling rows from all partitions. If <literal>ONLY</literal> is not
+ specified, it will also recurse into each partition and update its
+ statistics. Each leaf partition is analyzed only once, even with
+ multi-level partitioning. No statistics are collected for only the
+ parent table (without data from its partitions), because with
+ partitioning it is guaranteed to be empty.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 9857b35627..465f9fecf7 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -42,7 +42,7 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<phrase>and <replaceable class="parameter">table_and_columns</replaceable> is:</phrase>
- <replaceable class="parameter">table_name</replaceable> [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
+ [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ * ] [ ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ]
</synopsis>
</refsynopsisdiv>
@@ -401,8 +401,12 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If the specified table is a partitioned
- table, all of its leaf partitions are vacuumed.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified before
+ the table name, only that table is vacuumed. If <literal>ONLY</literal>
+ is not specified, the table and all its descendant tables or partitions
+ (if any) are also vacuumed. Optionally, <literal>*</literal>
+ can be specified after the table name to explicitly indicate that
+ descendant tables are included.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 7d8e9d2045..951e00acee 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -851,7 +851,8 @@ vacuum_open_relation(Oid relid, RangeVar *relation, bits32 options,
/*
* Given a VacuumRelation, fill in the table OID if it wasn't specified,
- * and optionally add VacuumRelations for partitions of the table.
+ * and optionally add VacuumRelations for partitions or descendant tables
+ * of the table.
*
* If a VacuumRelation does not have an OID supplied and is a partitioned
* table, an extra entry will be added to the output for each partition.
@@ -879,11 +880,15 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
}
else
{
- /* Process a specific relation, and possibly partitions thereof */
+ /*
+ * Process a specific relation, and possibly partitions and/or child
+ * tables thereof
+ */
Oid relid;
HeapTuple tuple;
Form_pg_class classForm;
- bool include_parts;
+ bool include_children;
+ bool is_partitioned_table;
int rvr_opts;
/*
@@ -944,20 +949,32 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
MemoryContextSwitchTo(oldcontext);
}
+ /*
+ * Vacuuming a partitioned table with ONLY will not do anything since
+ * the partitioned table itself is empty. Issue a warning if the user
+ * requests this.
+ */
+ include_children = vrel->relation->inh;
+ is_partitioned_table = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
+ if ((options & VACOPT_VACUUM) && is_partitioned_table && !include_children)
+ ereport(WARNING,
+ (errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
+ vrel->relation->relname)));
+
- include_parts = (classForm->relkind == RELKIND_PARTITIONED_TABLE);
ReleaseSysCache(tuple);
/*
- * If it is, make relation list entries for its partitions. Note that
- * the list returned by find_all_inheritors() includes the passed-in
- * OID, so we have to skip that. There's no point in taking locks on
- * the individual partitions yet, and doing so would just add
+ * Unless the user has specified ONLY, make relation list entries for
+ * its partitions and/or descendant tables. Note that the list
+ * returned by find_all_inheritors() includes the passed-in OID, so we
+ * have to skip that. There's no point in taking locks on the
+ * individual partitions/tables yet, and doing so would just add
* unnecessary deadlock risk. For this last reason we do not check
- * yet the ownership of the partitions, which get added to the list to
- * process. Ownership will be checked later on anyway.
+ * yet the ownership of the partitions/tables, which get added to the
+ * list to process. Ownership will be checked later on anyway.
*/
- if (include_parts)
+ if (include_children)
{
List *part_oids = find_all_inheritors(relid, NoLock, NULL);
ListCell *part_lc;
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index ab304ca989..b1d4642c59 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -11921,7 +11921,7 @@ opt_name_list:
;
vacuum_relation:
- qualified_name opt_name_list
+ relation_expr opt_name_list
{
$$ = (Node *) makeVacuumRelation($1, InvalidOid, $2);
}
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 2eba712887..7165235ece 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,6 +291,86 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | f | f
+(2 rows)
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+--------------+----------+----------
+ only_parted | t | f
+ only_parted1 | t | f
+(2 rows)
+
+VACUUM ONLY vacparted; -- gives warning
+WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | f | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | f
+(2 rows)
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | f
+ only_inh_parent | t | t
+(2 rows)
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+ relname | analyzed | vacuumed
+-----------------+----------+----------
+ only_inh_child | t | t
+ only_inh_parent | t | t
+(2 rows)
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ERROR: relation "does_not_exist" does not exist
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index 548cd7acca..fc84df0f07 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,6 +233,58 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
+-- ANALYZE ONLY / VACUUM ONLY on partitioned table
+CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
+INSERT INTO only_parted VALUES (1, 'a');
+
+-- Only partitioned table is analyzed
+ANALYZE ONLY only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+-- Partitioned table and partitions are analyzed
+ANALYZE only_parted;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_parted%'
+ ORDER BY relname;
+
+VACUUM ONLY vacparted; -- gives warning
+ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+
+-- ANALYZE ONLY on inherited tables
+CREATE TABLE only_inh_parent (a int primary key, b TEXT);
+CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
+INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
+
+-- Only parent is ANALYZED
+ANALYZE ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are ANALYZED
+ANALYZE only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Only parent is VACUUMED
+VACUUM ONLY only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+-- Parent and child are VACUUMED
+VACUUM only_inh_parent;
+SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
+ WHERE relname like 'only_inh%'
+ ORDER BY relname;
+
+DROP TABLE only_parted CASCADE;
+DROP TABLE only_inh_parent CASCADE;
+
-- parenthesized syntax for ANALYZE
ANALYZE (VERBOSE) does_not_exist;
ANALYZE (nonexistent-arg) does_not_exist;
--
2.34.1
v8-0002-fixup-Implementation-of-the-ONLY-feature-for-ANAL.patchapplication/octet-stream; name=v8-0002-fixup-Implementation-of-the-ONLY-feature-for-ANAL.patchDownload
From c1f364e4a64fec0a7acb9acf3da5acdd2d21789e Mon Sep 17 00:00:00 2001
From: David Rowley <dgrowley@gmail.com>
Date: Sun, 22 Sep 2024 23:56:39 +1200
Subject: [PATCH v8 2/2] fixup! Implementation of the 'ONLY' feature for
ANALYZE and VACUUM.
---
doc/src/sgml/ddl.sgml | 21 +++++-----
doc/src/sgml/monitoring.sgml | 2 +-
doc/src/sgml/ref/analyze.sgml | 36 ++++++++--------
doc/src/sgml/ref/vacuum.sgml | 13 +++---
src/backend/commands/vacuum.c | 12 +++---
src/test/regress/expected/vacuum.out | 60 ++++++++++++++++-----------
src/test/regress/sql/vacuum.sql | 62 +++++++++++++++++-----------
7 files changed, 116 insertions(+), 90 deletions(-)
diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index c7de29ae71..8ab0ddb112 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3778,14 +3778,14 @@ VALUES ('Albany', NULL, NULL, 'NY');
not <literal>INSERT</literal> or <literal>ALTER TABLE ...
RENAME</literal>) typically default to including child tables and
support the <literal>ONLY</literal> notation to exclude them.
- Some commands that do database maintenance and tuning
+ The majority of commands that do database maintenance and tuning
(e.g., <literal>REINDEX</literal>) only work on individual, physical
- tables and do not support recursing over inheritance hierarchies,
- while other maintenance commands (e.g., <literal>VACUUM</literal>,
- <literal>ANALYZE</literal>) default to including child tables
- and support the <literal>ONLY</literal> notation to exclude them.
- The respective behavior of each individual command is documented
- in its reference page (<xref linkend="sql-commands"/>).
+ tables and do not support recursing over inheritance hierarchies.
+ However, both <literal>VACUUM</literal> and <literal>ANALYZE</literal>
+ commands default to including child tables and the <literal>ONLY</literal>
+ notation is supported to allow them to be excluded. The respective
+ behavior of each individual command is documented in its reference page
+ (<xref linkend="sql-commands"/>).
</para>
<para>
@@ -4856,9 +4856,10 @@ ALTER TABLE measurement_y2008m02 INHERIT measurement;
<listitem>
<para>
- Manual <command>VACUUM</command> or <command>ANALYZE</command> commands
- will automatically process all descendant tables. If this is undesirable,
- you can use the <literal>ONLY</literal> keyword. A command like:
+ Manual <command>VACUUM</command> and <command>ANALYZE</command>
+ commands will automatically process all inheritance child tables. If
+ this is undesirable, you can use the <literal>ONLY</literal> keyword.
+ A command like:
<programlisting>
ANALYZE ONLY measurement;
</programlisting>
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 82d6aef858..db7f35a451 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -5527,7 +5527,7 @@ FROM pg_stat_get_backend_idset() AS backendid;
<para>
Note that when <command>ANALYZE</command> is run on a partitioned table
without the <literal>ONLY</literal> keyword, all of its partitions are
- also recursively analyzed. In that case, <command>ANALYZE</command>
+ also recursively analyzed. In that case, <command>ANALYZE</command>
progress is reported first for the parent table, whereby its inheritance
statistics are collected, followed by that for each partition.
</para>
diff --git a/doc/src/sgml/ref/analyze.sgml b/doc/src/sgml/ref/analyze.sgml
index 92e465ca59..a0db56ae74 100644
--- a/doc/src/sgml/ref/analyze.sgml
+++ b/doc/src/sgml/ref/analyze.sgml
@@ -142,12 +142,12 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
The name (possibly schema-qualified) of a specific table to
analyze. If omitted, all regular tables, partitioned tables, and
materialized views in the current database are analyzed (but not
- foreign tables). If <literal>ONLY</literal> is specified before
- the table name, only that table is analyzed. If <literal>ONLY</literal>
- is not specified, the table and all its descendant tables or partitions
- (if any) are analyzed. Optionally, <literal>*</literal>
+ foreign tables). If <literal>ONLY</literal> is specified before
+ the table name, only that table is analyzed. If <literal>ONLY</literal>
+ is not specified, the table and all its inheritance child tables or
+ partitions (if any) are analyzed. Optionally, <literal>*</literal>
can be specified after the table name to explicitly indicate that
- descendant tables are included.
+ inheritance child tables (or partitions) are to be analyzed.
</para>
</listitem>
</varlistentry>
@@ -287,23 +287,23 @@ ANALYZE [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <r
<command>ANALYZE</command> gathers two sets of statistics: one on the rows
of the parent table only, and a second including rows of both the parent
table and all of its children. This second set of statistics is needed when
- planning queries that process the inheritance tree as a whole. If the
- <literal>ONLY</literal> keyword is used, child tables themselves are not
- individually analyzed. The autovacuum daemon, however, will only consider
- inserts or updates on the parent table itself when deciding whether to
- trigger an automatic analyze for that table. If that table is rarely
- inserted into or updated, the inheritance statistics will not be up to
- date unless you run <command>ANALYZE</command> manually.
+ planning queries that process the inheritance tree as a whole. The
+ autovacuum daemon, however, will only consider inserts or updates on the
+ parent table itself when deciding whether to trigger an automatic analyze
+ for that table. If that table is rarely inserted into or updated, the
+ inheritance statistics will not be up to date unless you run
+ <command>ANALYZE</command> manually. By default,
+ <command>ANALYZE</command> will also recursively collect and update the
+ statistics for each inheritance child table. The <literal>ONLY</literal>
+ keyword may be used to disable this.
</para>
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
- sampling rows from all partitions. If <literal>ONLY</literal> is not
- specified, it will also recurse into each partition and update its
- statistics. Each leaf partition is analyzed only once, even with
- multi-level partitioning. No statistics are collected for only the
- parent table (without data from its partitions), because with
- partitioning it is guaranteed to be empty.
+ sampling rows from all partitions. By default,
+ <command>ANALYZE</command> will also recursively collect and update the
+ statistics for each partition. The <literal>ONLY</literal> keyword may be
+ used to disable this.
</para>
<para>
diff --git a/doc/src/sgml/ref/vacuum.sgml b/doc/src/sgml/ref/vacuum.sgml
index 465f9fecf7..9110938fab 100644
--- a/doc/src/sgml/ref/vacuum.sgml
+++ b/doc/src/sgml/ref/vacuum.sgml
@@ -401,12 +401,13 @@ VACUUM [ ( <replaceable class="parameter">option</replaceable> [, ...] ) ] [ <re
<listitem>
<para>
The name (optionally schema-qualified) of a specific table or
- materialized view to vacuum. If <literal>ONLY</literal> is specified before
- the table name, only that table is vacuumed. If <literal>ONLY</literal>
- is not specified, the table and all its descendant tables or partitions
- (if any) are also vacuumed. Optionally, <literal>*</literal>
- can be specified after the table name to explicitly indicate that
- descendant tables are included.
+ materialized view to vacuum. If <literal>ONLY</literal> is specified
+ before the table name, only that table is vacuumed. If
+ <literal>ONLY</literal> is not specified, the table and all its
+ inheritance child tables or partitions (if any) are also vacuumed.
+ Optionally, <literal>*</literal> can be specified after the table name
+ to explicitly indicate that inheritance child tables (or partitions) are
+ to be vacuumed.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index 951e00acee..bcadca4214 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -851,8 +851,7 @@ vacuum_open_relation(Oid relid, RangeVar *relation, bits32 options,
/*
* Given a VacuumRelation, fill in the table OID if it wasn't specified,
- * and optionally add VacuumRelations for partitions or descendant tables
- * of the table.
+ * and optionally add VacuumRelations for partitions or inheritance children.
*
* If a VacuumRelation does not have an OID supplied and is a partitioned
* table, an extra entry will be added to the output for each partition.
@@ -881,8 +880,8 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
else
{
/*
- * Process a specific relation, and possibly partitions and/or child
- * tables thereof
+ * Process a specific relation, and possibly partitions or child
+ * tables thereof.
*/
Oid relid;
HeapTuple tuple;
@@ -951,7 +950,7 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
/*
* Vacuuming a partitioned table with ONLY will not do anything since
- * the partitioned table itself is empty. Issue a warning if the user
+ * the partitioned table itself is empty. Issue a warning if the user
* requests this.
*/
include_children = vrel->relation->inh;
@@ -961,12 +960,11 @@ expand_vacuum_rel(VacuumRelation *vrel, MemoryContext vac_context,
(errmsg("VACUUM ONLY of partitioned table \"%s\" has no effect",
vrel->relation->relname)));
-
ReleaseSysCache(tuple);
/*
* Unless the user has specified ONLY, make relation list entries for
- * its partitions and/or descendant tables. Note that the list
+ * its partitions or inheritance child tables. Note that the list
* returned by find_all_inheritors() includes the passed-in OID, so we
* have to skip that. There's no point in taking locks on the
* individual partitions/tables yet, and doing so would just add
diff --git a/src/test/regress/expected/vacuum.out b/src/test/regress/expected/vacuum.out
index 7165235ece..1a07dbf67d 100644
--- a/src/test/regress/expected/vacuum.out
+++ b/src/test/regress/expected/vacuum.out
@@ -291,14 +291,17 @@ ANALYZE vactst, vactst;
BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
--- ANALYZE ONLY / VACUUM ONLY on partitioned table
-CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+--
+-- Tests for ANALYZE ONLY / VACUUM ONLY on partitioned tables
+--
+CREATE TABLE only_parted (a int, b text) PARTITION BY LIST (a);
CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
INSERT INTO only_parted VALUES (1, 'a');
--- Only partitioned table is analyzed
+-- Ensure only the partitioned table is analyzed
ANALYZE ONLY only_parted;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_parted%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_parted'::regclass, 'only_parted1'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
--------------+----------+----------
@@ -306,10 +309,11 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_parted1 | f | f
(2 rows)
--- Partitioned table and partitions are analyzed
+-- Ensure partitioned table and the partitions are analyzed
ANALYZE only_parted;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_parted%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_parted'::regclass, 'only_parted1'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
--------------+----------+----------
@@ -317,17 +321,23 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_parted1 | t | f
(2 rows)
-VACUUM ONLY vacparted; -- gives warning
+DROP TABLE only_parted;
+-- VACUUM ONLY on a partitioned table does nothing, ensure we get a warning.
+VACUUM ONLY vacparted;
WARNING: VACUUM ONLY of partitioned table "vacparted" has no effect
-ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
--- ANALYZE ONLY on inherited tables
+-- Try ANALYZE ONLY with a column list
+ANALYZE ONLY vacparted(a,b);
+--
+-- Tests for VACUUM ONLY / ANALYZE ONLY on inheritance tables
+--
CREATE TABLE only_inh_parent (a int primary key, b TEXT);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
--- Only parent is ANALYZED
+-- Ensure only parent is analyzed
ANALYZE ONLY only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
-----------------+----------+----------
@@ -335,10 +345,11 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_inh_parent | t | f
(2 rows)
--- Parent and child are ANALYZED
+-- Ensure the parent and child are analyzed
ANALYZE only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
-----------------+----------+----------
@@ -346,10 +357,11 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_inh_parent | t | f
(2 rows)
--- Only parent is VACUUMED
+-- Ensure only the parent is vacuumed
VACUUM ONLY only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
-----------------+----------+----------
@@ -357,10 +369,11 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_inh_parent | t | t
(2 rows)
--- Parent and child are VACUUMED
+-- Ensure parent and child are vacuumed
VACUUM only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
relname | analyzed | vacuumed
-----------------+----------+----------
@@ -368,7 +381,6 @@ SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as
only_inh_parent | t | t
(2 rows)
-DROP TABLE only_parted CASCADE;
DROP TABLE only_inh_parent CASCADE;
NOTICE: drop cascades to table only_inh_child
-- parenthesized syntax for ANALYZE
diff --git a/src/test/regress/sql/vacuum.sql b/src/test/regress/sql/vacuum.sql
index fc84df0f07..5e55079e71 100644
--- a/src/test/regress/sql/vacuum.sql
+++ b/src/test/regress/sql/vacuum.sql
@@ -233,56 +233,70 @@ BEGIN; -- ANALYZE behaves differently inside a transaction block
ANALYZE vactst, vactst;
COMMIT;
--- ANALYZE ONLY / VACUUM ONLY on partitioned table
-CREATE TABLE only_parted (a int, b char) PARTITION BY LIST (a);
+--
+-- Tests for ANALYZE ONLY / VACUUM ONLY on partitioned tables
+--
+CREATE TABLE only_parted (a int, b text) PARTITION BY LIST (a);
CREATE TABLE only_parted1 PARTITION OF only_parted FOR VALUES IN (1);
INSERT INTO only_parted VALUES (1, 'a');
--- Only partitioned table is analyzed
+-- Ensure only the partitioned table is analyzed
ANALYZE ONLY only_parted;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_parted%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_parted'::regclass, 'only_parted1'::regclass)
ORDER BY relname;
--- Partitioned table and partitions are analyzed
+-- Ensure partitioned table and the partitions are analyzed
ANALYZE only_parted;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_parted%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_parted'::regclass, 'only_parted1'::regclass)
ORDER BY relname;
-VACUUM ONLY vacparted; -- gives warning
-ANALYZE ONLY vacparted(a,b); -- combine ONLY with column list
+DROP TABLE only_parted;
+
+-- VACUUM ONLY on a partitioned table does nothing, ensure we get a warning.
+VACUUM ONLY vacparted;
--- ANALYZE ONLY on inherited tables
+-- Try ANALYZE ONLY with a column list
+ANALYZE ONLY vacparted(a,b);
+
+--
+-- Tests for VACUUM ONLY / ANALYZE ONLY on inheritance tables
+--
CREATE TABLE only_inh_parent (a int primary key, b TEXT);
CREATE TABLE only_inh_child () INHERITS (only_inh_parent);
INSERT INTO only_inh_child(a,b) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'ccc');
--- Only parent is ANALYZED
+-- Ensure only parent is analyzed
ANALYZE ONLY only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
--- Parent and child are ANALYZED
+-- Ensure the parent and child are analyzed
ANALYZE only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
--- Only parent is VACUUMED
+-- Ensure only the parent is vacuumed
VACUUM ONLY only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
--- Parent and child are VACUUMED
+-- Ensure parent and child are vacuumed
VACUUM only_inh_parent;
-SELECT relname, last_analyze is not null as analyzed, last_vacuum is not null as vacuumed FROM pg_stat_user_tables
- WHERE relname like 'only_inh%'
+SELECT relname, last_analyze IS NOT NULL AS analyzed, last_vacuum IS NOT NULL AS vacuumed
+ FROM pg_stat_user_tables
+ WHERE relid IN ('only_inh_parent'::regclass, 'only_inh_child'::regclass)
ORDER BY relname;
-DROP TABLE only_parted CASCADE;
DROP TABLE only_inh_parent CASCADE;
-- parenthesized syntax for ANALYZE
--
2.34.1
On Mon, Sep 23, 2024 at 6:04 PM David Rowley <dgrowleyml@gmail.com> wrote:
If this is confusing, I think there's a bunch of detail that I tried
to keep that's just not that useful. The part about analyzing
partitions just once and the part about not collecting non-inheritance
stats for the partitioned table seems like extra detail that's either
obvious or just not that important.Can you have a look at the attached and let me know if it's easier to
understand now?
Now the regress test passed.
<para>
For partitioned tables, <command>ANALYZE</command> gathers statistics by
sampling rows from all partitions. By default,
<command>ANALYZE</command> will also recursively collect and update the
statistics for each partition. The <literal>ONLY</literal> keyword may be
used to disable this.
</para>
is very clear to me!
<para>
If the table being analyzed has inheritance children,
<command>ANALYZE</command> gathers two sets of statistics: one on the rows
of the parent table only, and a second including rows of both the parent
table and all of its children. This second set of statistics is needed when
planning queries that process the inheritance tree as a whole. The
autovacuum daemon, however, will only consider inserts or updates on the
parent table itself when deciding whether to trigger an automatic analyze
for that table. If that table is rarely inserted into or updated, the
inheritance statistics will not be up to date unless you run
<command>ANALYZE</command> manually. By default,
<command>ANALYZE</command> will also recursively collect and update the
statistics for each inheritance child table. The <literal>ONLY</literal>
keyword may be used to disable this.
</para>
looks fine. but maybe we can add the following information
"if The <literal>ONLY</literal> is specified, the second set of
statistics won't include each children individual statistics"
I think that's the main difference between specifying ONLY or not?
catalog-pg-statistic.html second paragraph seems very clear to me.
Maybe we can link it somehow
Other than that, it looks good to me.
On Mon, 23 Sept 2024 at 23:23, jian he <jian.universality@gmail.com> wrote:
looks fine. but maybe we can add the following information
"if The <literal>ONLY</literal> is specified, the second set of
statistics won't include each children individual statistics"
I think that's the main difference between specifying ONLY or not?
Ok, I think you're not understanding this yet and I'm not sure what I
can make more clear in the documents.
Let me explain... For inheritance parent tables, ANALYZE ONLY will
gather inheritance and non-inheritance statistics for ONLY the parent.
Here's an example of that:
drop table if exists parent,child;
create table parent(a int);
create table child () inherits (parent);
insert into parent values(1);
insert into child values(1);
analyze ONLY parent;
select starelid::regclass,stainherit,stadistinct from pg_statistic
where starelid in ('parent'::regclass,'child'::regclass);
starelid | stainherit | stadistinct
----------+------------+-------------
parent | f | -1 <- this is the distinct estimate
for SELECT * FROM ONLY parent;
parent | t | -0.5 <- this is the distinct estimate
for SELECT * FROM parent;
(2 rows)
For the stainherit==false stats, only 1 row is sampled here as that's
the only row directly located in the "parent" table.
For the stainherit==true stats, 2 rows are sampled, both of them have
"a" == 1. The stadistinct reflects that fact.
Note there have been no statistics recorded for "child". However,
analyze did sample rows in that table as part of gathering sample rows
for "parent" for the stainherit==true row.
Now let's try again without ONLY.
analyze parent;
select starelid::regclass,stainherit,stadistinct from pg_statistic
where starelid in ('parent'::regclass,'child'::regclass);
starelid | stainherit | stadistinct
----------+------------+-------------
parent | f | -1
parent | t | -0.5
child | f | -1
(3 rows)
All of the above rows were re-calculated with the "analyze parent"
command, the first two rows have the same values as nothing changed in
the table, however, there are now statistics stored for the "child"
table.
catalog-pg-statistic.html second paragraph seems very clear to me.
Maybe we can link it somehow
I don't know what "link it" means in this context.
David
On Mon, Sep 23, 2024 at 7:53 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Mon, 23 Sept 2024 at 23:23, jian he <jian.universality@gmail.com> wrote:
looks fine. but maybe we can add the following information
"if The <literal>ONLY</literal> is specified, the second set of
statistics won't include each children individual statistics"
I think that's the main difference between specifying ONLY or not?Ok, I think you're not understanding this yet and I'm not sure what I
can make more clear in the documents.Let me explain... For inheritance parent tables, ANALYZE ONLY will
gather inheritance and non-inheritance statistics for ONLY the parent.Here's an example of that:
drop table if exists parent,child;
create table parent(a int);
create table child () inherits (parent);
insert into parent values(1);
insert into child values(1);analyze ONLY parent;
select starelid::regclass,stainherit,stadistinct from pg_statistic
where starelid in ('parent'::regclass,'child'::regclass);
starelid | stainherit | stadistinct
----------+------------+-------------
parent | f | -1 <- this is the distinct estimate
for SELECT * FROM ONLY parent;
parent | t | -0.5 <- this is the distinct estimate
for SELECT * FROM parent;
(2 rows)For the stainherit==false stats, only 1 row is sampled here as that's
the only row directly located in the "parent" table.
For the stainherit==true stats, 2 rows are sampled, both of them have
"a" == 1. The stadistinct reflects that fact.Note there have been no statistics recorded for "child". However,
analyze did sample rows in that table as part of gathering sample rows
for "parent" for the stainherit==true row.Now let's try again without ONLY.
analyze parent;
select starelid::regclass,stainherit,stadistinct from pg_statistic
where starelid in ('parent'::regclass,'child'::regclass);
starelid | stainherit | stadistinct
----------+------------+-------------
parent | f | -1
parent | t | -0.5
child | f | -1
(3 rows)All of the above rows were re-calculated with the "analyze parent"
command, the first two rows have the same values as nothing changed in
the table, however, there are now statistics stored for the "child"
table.
thanks for your explanation!
now I don't have any questions about this patch.
catalog-pg-statistic.html second paragraph seems very clear to me.
Maybe we can link it somehowI don't know what "link it" means in this context.
i mean, change to:
By default,
<command>ANALYZE</command> will also recursively collect and update the
statistics for each inheritance child table. The <literal>ONLY</literal>
keyword may be used to disable this.
You may also refer to catalog <link
linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>
description about <literal>stainherit</literal>.
but <link linkend="catalog-pg-statistic"><structname>pg_statistic</structname></link>
already mentioned once.
maybe not a good idea.