document the need to analyze partitioned tables

pryzby@telsasoft.com

over 4 years ago

In reply to: Justin Pryzby (#1)

Re: document the need to analyze partitioned tables

Cleaned up and attached as a .patch.

The patch implementing autoanalyze on partitioned tables should revert relevant
portions of this patch.

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 4 years ago

In reply to: Justin Pryzby (#3)

Re: document the need to analyze partitioned tables

Hi,

On 10/8/21 14:58, Justin Pryzby wrote:

Cleaned up and attached as a .patch.

The patch implementing autoanalyze on partitioned tables should
revert relevant portions of this patch.

I went through this patch and I'd like to propose a couple changes, per
the 0002 patch:

1) I've reworded the changes in maintenance.sgml a bit. It sounded a bit
strange before, but I'm not a native speaker so maybe it's worse ...

2) Remove unnecessary whitespace changes in perform.sgml.

3) Simplify the analyze.sgml changes a bit - it was trying to cram too
much stuff into a single paragraph, so I split that.

Does that seem OK, or did omit something important?

FWIW I think it's really confusing we have inheritance and partitioning,
and partitions and child tables. And sometimes we use partitioning in
the generic sense (i.e. including the inheritance approach), and
sometimes only the declarative variant. Same for partitions vs child
tables. I can't even imagine how confusing this has to be for people
just learning this stuff. They must be in permanent WTF?! state ...

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pryzby@telsasoft.com

over 4 years ago

In reply to: Tomas Vondra (#4)

Re: document the need to analyze partitioned tables

Thanks for looking at this

On Fri, Jan 21, 2022 at 06:21:57PM +0100, Tomas Vondra wrote:

Hi,

On 10/8/21 14:58, Justin Pryzby wrote:

Cleaned up and attached as a .patch.

The patch implementing autoanalyze on partitioned tables should
revert relevant portions of this patch.

I went through this patch and I'd like to propose a couple changes, per the
0002 patch:

1) I've reworded the changes in maintenance.sgml a bit. It sounded a bit
strange before, but I'm not a native speaker so maybe it's worse ...

+     autoanalyze on the parent table.  If your queries require statistics on                                                                       
+     parent relations for proper planning, it's necessary to periodically run

You added two references to "relations", but everything else talks about
"tables", which is all that analyze processes.

2) Remove unnecessary whitespace changes in perform.sgml.

Those were a note to myself and to any reviewer - should that be updated too ?

3) Simplify the analyze.sgml changes a bit - it was trying to cram too much
stuff into a single paragraph, so I split that.

Does that seem OK, or did omit something important?

+ If the table being analyzed has one or more children,

I think you're referring to both legacy inheritance and and partitioning. That
should be more clear.

+    <command>ANALYZE</command> gathers two sets of statistics: once on the rows
+    of the parent table only, and a second one including rows of both the parent
+    table and all child relations.  This second set of statistics is needed when

I think should say ".. and all of its children".

FWIW I think it's really confusing we have inheritance and partitioning, and
partitions and child tables. And sometimes we use partitioning in the
generic sense (i.e. including the inheritance approach), and sometimes only
the declarative variant. Same for partitions vs child tables. I can't even
imagine how confusing this has to be for people just learning this stuff.
They must be in permanent WTF?! state ...

The docs were cleaned up some in 0c06534bd. At least the word "partitioned"
should never be used for legacy inheritance - but "partitioning" is.

--
Justin

Tomas Vondra

tomas.vondra@2ndquadrant.com

over 4 years ago

In reply to: Justin Pryzby (#5)

Re: document the need to analyze partitioned tables

On 1/21/22 19:02, Justin Pryzby wrote:

Thanks for looking at this

On Fri, Jan 21, 2022 at 06:21:57PM +0100, Tomas Vondra wrote:

Hi,

On 10/8/21 14:58, Justin Pryzby wrote:

Cleaned up and attached as a .patch.

The patch implementing autoanalyze on partitioned tables should
revert relevant portions of this patch.

I went through this patch and I'd like to propose a couple changes, per the
0002 patch:

1) I've reworded the changes in maintenance.sgml a bit. It sounded a bit
strange before, but I'm not a native speaker so maybe it's worse ...
+     autoanalyze on the parent table.  If your queries require statistics on
+     parent relations for proper planning, it's necessary to periodically run
You added two references to "relations", but everything else talks about
"tables", which is all that analyze processes.

Good point, that should use "tables" too.

2) Remove unnecessary whitespace changes in perform.sgml.

Those were a note to myself and to any reviewer - should that be updated too ?

Ah, I see. I don't think that part needs updating - it talks about
having to analyze after a bulk load, and that applies to all tables
anyway. I don't think it needs to mention partitioned tables need an
analyze too.

3) Simplify the analyze.sgml changes a bit - it was trying to cram too much
stuff into a single paragraph, so I split that.

Does that seem OK, or did omit something important?

+ If the table being analyzed has one or more children,

I think you're referring to both legacy inheritance and and partitioning. That
should be more clear.

I think it applies to both types of partitioning - it's just that in the
declarative partitioning case the table is always empty so no stats with
inherit=false are built.

+    <command>ANALYZE</command> gathers two sets of statistics: once on the rows
+    of the parent table only, and a second one including rows of both the parent
+    table and all child relations.  This second set of statistics is needed when

I think should say ".. and all of its children".

FWIW I think it's really confusing we have inheritance and partitioning, and
partitions and child tables. And sometimes we use partitioning in the
generic sense (i.e. including the inheritance approach), and sometimes only
the declarative variant. Same for partitions vs child tables. I can't even
imagine how confusing this has to be for people just learning this stuff.
They must be in permanent WTF?! state ...

The docs were cleaned up some in 0c06534bd. At least the word "partitioned"
should never be used for legacy inheritance - but "partitioning" is.

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Robert Haas

robertmhaas@gmail.com

about 4 years ago

In reply to: Tomas Vondra (#6)

Re: document the need to analyze partitioned tables

On Fri, Jan 21, 2022 at 1:31 PM Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:

[ new patch ]

This patch is originally by Justin. The latest version is by Tomas. I
think the next step is for Justin to say whether he's OK with the
latest version that Tomas posted. If he is, then I suggest that he
also mark it Ready for Committer, and that Tomas commit it. If he's
not, he should say what he wants changed and either post a new version
himself or wait for Tomas to do that.

I think the fact that is classified as a "Bug Fix" in the CommitFest
application is not particularly good. I would prefer to see it
classified under "Documentation". I'm prepared to concede that
documentation can have bugs as a general matter, but nobody's data is
getting eaten because the documentation wasn't updated. In fact, this
is the fourth patch from the "bug fix" section I've studied this
afternoon, and, well, none of them have been back-patchable code
defects.

--
Robert Haas
EDB: http://www.enterprisedb.com

pryzby@telsasoft.com

about 4 years ago

In reply to: Robert Haas (#7)

Re: document the need to analyze partitioned tables

On Mon, Mar 14, 2022 at 05:23:54PM -0400, Robert Haas wrote:

On Fri, Jan 21, 2022 at 1:31 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

[ new patch ]

This patch is originally by Justin. The latest version is by Tomas. I
think the next step is for Justin to say whether he's OK with the
latest version that Tomas posted. If he is, then I suggest that he
also mark it Ready for Committer, and that Tomas commit it. If he's
not, he should say what he wants changed and either post a new version
himself or wait for Tomas to do that.

Yes, I think it can be Ready. Done.

I amended some of Tomas' changes (see 0003, attached as txt).

@cfbot: the *.patch file is for your consumption, and the others are only there
to show my changes.

I think the fact that is classified as a "Bug Fix" in the CommitFest
application is not particularly good. I would prefer to see it
classified under "Documentation". I'm prepared to concede that
documentation can have bugs as a general matter, but nobody's data is
getting eaten because the documentation wasn't updated. In fact, this
is the fourth patch from the "bug fix" section I've studied this
afternoon, and, well, none of them have been back-patchable code
defects.

In fact, I consider this to be back-patchable back to v10. IMO it's an
omission that this isn't documented. Not all bugs cause data to be eaten. If
someone reads the existing documentation, they might conclude that their
partitioned tables don't need to be analyzed, and they would've been better
served by not reading the docs.

--
Justin

Tomas Vondra

tomas.vondra@2ndquadrant.com

about 4 years ago

In reply to: Justin Pryzby (#8)

Re: document the need to analyze partitioned tables

On 3/16/22 00:00, Justin Pryzby wrote:

On Mon, Mar 14, 2022 at 05:23:54PM -0400, Robert Haas wrote:

On Fri, Jan 21, 2022 at 1:31 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

[ new patch ]

This patch is originally by Justin. The latest version is by Tomas. I
think the next step is for Justin to say whether he's OK with the
latest version that Tomas posted. If he is, then I suggest that he
also mark it Ready for Committer, and that Tomas commit it. If he's
not, he should say what he wants changed and either post a new version
himself or wait for Tomas to do that.

Yes, I think it can be Ready. Done.

I amended some of Tomas' changes (see 0003, attached as txt).

@cfbot: the *.patch file is for your consumption, and the others are only there
to show my changes.

I think the fact that is classified as a "Bug Fix" in the CommitFest
application is not particularly good. I would prefer to see it
classified under "Documentation". I'm prepared to concede that
documentation can have bugs as a general matter, but nobody's data is
getting eaten because the documentation wasn't updated. In fact, this
is the fourth patch from the "bug fix" section I've studied this
afternoon, and, well, none of them have been back-patchable code
defects.

In fact, I consider this to be back-patchable back to v10. IMO it's an
omission that this isn't documented. Not all bugs cause data to be eaten. If
someone reads the existing documentation, they might conclude that their
partitioned tables don't need to be analyzed, and they would've been better
served by not reading the docs.

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#10

Daniel Gustafsson

daniel@yesql.se

about 4 years ago

In reply to: Tomas Vondra (#9)

Re: document the need to analyze partitioned tables

On 28 Mar 2022, at 15:05, Tomas Vondra <tomas.vondra@enterprisedb.com> wrote:

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).

I happened to spot a small typo in this commit in the ANALYZE docs, and have
just pushed a fix all the way down to 10 as per the original commit.

--
Daniel Gustafsson https://vmware.com/

#11

[0]: https://www.postgresql.org/docs/devel/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE

laurenz.albe@cybertec.at

over 3 years ago

In reply to: Tomas Vondra (#9)

Re: document the need to analyze partitioned tables

On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).

I'd like to suggest an improvement to this. The current wording could
be read to mean that dead tuples won't get cleaned up in partitioned tables.

By the way, where are the statistics of a partitioned tables used? The actual
tables scanned are always the partitions, and in the execution plans that
I have seen, the optimizer always used the statistics of the partitions.

Yours,
Laurenz Albe

#12

Andrei Lepikhov

lepihov@gmail.com

over 3 years ago

In reply to: Laurenz Albe (#11)

Re: document the need to analyze partitioned tables

On 10/5/22 13:37, Laurenz Albe wrote:

On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).

I'd like to suggest an improvement to this. The current wording could
be read to mean that dead tuples won't get cleaned up in partitioned tables.

By the way, where are the statistics of a partitioned tables used? The actual
tables scanned are always the partitions, and in the execution plans that
I have seen, the optimizer always used the statistics of the partitions.

For example, it is used to estimate selectivity of join clause:

CREATE TABLE test (id integer, val integer) PARTITION BY hash (id);
CREATE TABLE test_0 PARTITION OF test
FOR VALUES WITH (modulus 2, remainder 0);
CREATE TABLE test_1 PARTITION OF test
FOR VALUES WITH (modulus 2, remainder 1);

INSERT INTO test (SELECT q, q FROM generate_series(1,10) AS q);
VACUUM ANALYZE test;
INSERT INTO test (SELECT q, q%2 FROM generate_series(11,200) AS q);
VACUUM ANALYZE test_0,test_1;

EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;
VACUUM ANALYZE test;
EXPLAIN (ANALYZE, TIMING OFF, SUMMARY OFF)
SELECT * FROM test t1, test t2 WHERE t1.id = t2.val;

Here without actual statistics on parent table we make wrong prediction.

--
Regards
Andrey Lepikhov
Postgres Professional

#13

Nathan Bossart

nathandbossart@gmail.com

over 3 years ago

In reply to: Laurenz Albe (#11)

Re: document the need to analyze partitioned tables

On Wed, Oct 05, 2022 at 10:37:01AM +0200, Laurenz Albe wrote:

On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).

I'd like to suggest an improvement to this. The current wording could
be read to mean that dead tuples won't get cleaned up in partitioned tables.

Well, dead tuples won't get cleaned up in partitioned tables, as
partitioned tables do not have storage. But I see what you mean. Readers
might misinterpret this to mean that autovacuum will not process the
partitions. There's a good definition of what the docs mean by
"partitioned table" [0]https://www.postgresql.org/docs/devel/ddl-partitioning.html#DDL-PARTITIONING-DECLARATIVE, but FWIW it took me some time before I
consistently read "partitioned table" to mean "only the thing with relkind
set to 'p'" and not "both the partitioned table and its partitions." So,
while the current wording it technically correct, I think it'd be
reasonable to expand it to help avoid confusion.

Here is my take on the wording:

Since all the data for a partitioned table is stored in its partitions,
autovacuum does not process partitioned tables. Instead, autovacuum
processes the individual partitions that are regular tables. This
means that autovacuum only gathers statistics for the regular tables
that serve as partitions and not for the partitioned tables. Since
queries may rely on a partitioned table's statistics, you should
collect statistics via the ANALYZE command when it is first populated,
and again whenever the distribution of data in its partitions changes
significantly.

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

#14

bruce@momjian.us

over 3 years ago

In reply to: Nathan Bossart (#13)

Re: document the need to analyze partitioned tables

On Thu, Jan 12, 2023 at 03:27:47PM -0800, Nathan Bossart wrote:

On Wed, Oct 05, 2022 at 10:37:01AM +0200, Laurenz Albe wrote:

On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).

I'd like to suggest an improvement to this. The current wording could
be read to mean that dead tuples won't get cleaned up in partitioned tables.

Well, dead tuples won't get cleaned up in partitioned tables, as
partitioned tables do not have storage. But I see what you mean. Readers
might misinterpret this to mean that autovacuum will not process the
partitions. There's a good definition of what the docs mean by
"partitioned table" [0], but FWIW it took me some time before I
consistently read "partitioned table" to mean "only the thing with relkind
set to 'p'" and not "both the partitioned table and its partitions." So,
while the current wording it technically correct, I think it'd be
reasonable to expand it to help avoid confusion.

Here is my take on the wording:

Since all the data for a partitioned table is stored in its partitions,
autovacuum does not process partitioned tables. Instead, autovacuum
processes the individual partitions that are regular tables. This
means that autovacuum only gathers statistics for the regular tables
that serve as partitions and not for the partitioned tables. Since
queries may rely on a partitioned table's statistics, you should
collect statistics via the ANALYZE command when it is first populated,
and again whenever the distribution of data in its partitions changes
significantly.

Uh, what about autovacuum's handling of partitioned tables? This makes
it sound like it ignores them because it talks about manual ANALYZE.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#15

pryzby@telsasoft.com

over 3 years ago

In reply to: Bruce Momjian (#14)

Re: document the need to analyze partitioned tables

On Tue, Jan 17, 2023 at 03:53:24PM -0500, Bruce Momjian wrote:

On Thu, Jan 12, 2023 at 03:27:47PM -0800, Nathan Bossart wrote:

On Wed, Oct 05, 2022 at 10:37:01AM +0200, Laurenz Albe wrote:

On Mon, 2022-03-28 at 15:05 +0200, Tomas Vondra wrote:

I've pushed the last version, and backpatched it to 10 (not sure I'd
call it a bugfix, but I certainly agree with Justin it's worth
mentioning in the docs, even on older branches).

I'd like to suggest an improvement to this. The current wording could
be read to mean that dead tuples won't get cleaned up in partitioned tables.

Well, dead tuples won't get cleaned up in partitioned tables, as
partitioned tables do not have storage. But I see what you mean. Readers
might misinterpret this to mean that autovacuum will not process the
partitions. There's a good definition of what the docs mean by
"partitioned table" [0], but FWIW it took me some time before I
consistently read "partitioned table" to mean "only the thing with relkind
set to 'p'" and not "both the partitioned table and its partitions." So,
while the current wording it technically correct, I think it'd be
reasonable to expand it to help avoid confusion.

Here is my take on the wording:

Since all the data for a partitioned table is stored in its partitions,
autovacuum does not process partitioned tables. Instead, autovacuum
processes the individual partitions that are regular tables. This
means that autovacuum only gathers statistics for the regular tables
that serve as partitions and not for the partitioned tables. Since
queries may rely on a partitioned table's statistics, you should
collect statistics via the ANALYZE command when it is first populated,
and again whenever the distribution of data in its partitions changes
significantly.

Uh, what about autovacuum's handling of partitioned tables? This makes
it sound like it ignores them because it talks about manual ANALYZE.

If we're referring to the *partitioned* table, then it does ignore them.
See:

|commit 6f8127b7390119c21479f5ce495b7d2168930e82
|Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
|Date: Mon Aug 16 17:27:52 2021 -0400
|
| Revert analyze support for partitioned tables

Maybe (all?) the clarification the docs need is to say:
"Partitioned tables are not *themselves* processed by autovacuum."

--
Justin

#16

bruce@momjian.us

over 3 years ago

In reply to: Justin Pryzby (#15)

Re: document the need to analyze partitioned tables

On Tue, Jan 17, 2023 at 03:00:50PM -0600, Justin Pryzby wrote:

On Tue, Jan 17, 2023 at 03:53:24PM -0500, Bruce Momjian wrote:

On Thu, Jan 12, 2023 at 03:27:47PM -0800, Nathan Bossart wrote:

Here is my take on the wording:

Since all the data for a partitioned table is stored in its partitions,
autovacuum does not process partitioned tables. Instead, autovacuum
processes the individual partitions that are regular tables. This
means that autovacuum only gathers statistics for the regular tables
that serve as partitions and not for the partitioned tables. Since
queries may rely on a partitioned table's statistics, you should
collect statistics via the ANALYZE command when it is first populated,
and again whenever the distribution of data in its partitions changes
significantly.

Uh, what about autovacuum's handling of partitioned tables? This makes
it sound like it ignores them because it talks about manual ANALYZE.

If we're referring to the *partitioned* table, then it does ignore them.
See:

|commit 6f8127b7390119c21479f5ce495b7d2168930e82
|Author: Alvaro Herrera <alvherre@alvh.no-ip.org>
|Date: Mon Aug 16 17:27:52 2021 -0400
|
| Revert analyze support for partitioned tables

Yes, I see that patch was trying to combine the statistics of individual
partitions into a partitioned table summary.

Maybe (all?) the clarification the docs need is to say:
"Partitioned tables are not *themselves* processed by autovacuum."

Yes, I think the lack of autovacuum needs to be specifically mentioned
since most people assume autovacuum handles _all_ statistics updating.

Can someone summarize how bad it is we have no statistics on partitioned
tables? It sounds bad to me.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#17

[1]: /messages/by-id/3df5c68b-13aa-53d0-c0ec-ed98e6972e2e@postgrespro.ru

laurenz.albe@cybertec.at

over 3 years ago

In reply to: Bruce Momjian (#16)

Re: document the need to analyze partitioned tables

On Tue, 2023-01-17 at 16:16 -0500, Bruce Momjian wrote:

On Tue, Jan 17, 2023 at 03:00:50PM -0600, Justin Pryzby wrote:

Maybe (all?) the clarification the docs need is to say:
"Partitioned tables are not *themselves* processed by autovacuum."

Yes, I think the lack of autovacuum needs to be specifically mentioned
since most people assume autovacuum handles _all_ statistics updating.

Can someone summarize how bad it is we have no statistics on partitioned
tables? It sounds bad to me.

Andrey Lepikhov had an example earlier in this thread[1]/messages/by-id/3df5c68b-13aa-53d0-c0ec-ed98e6972e2e@postgrespro.ru. It doesn't take
an exotic query.

Attached is a new version of my patch that tries to improve the wording.

Yours,
Laurenz Albe

#18

pryzby@telsasoft.com

over 3 years ago

In reply to: Laurenz Albe (#17)

Re: document the need to analyze partitioned tables

On Wed, Jan 18, 2023 at 10:15:18AM +0100, Laurenz Albe wrote:

On Tue, 2023-01-17 at 16:16 -0500, Bruce Momjian wrote:

On Tue, Jan 17, 2023 at 03:00:50PM -0600, Justin Pryzby wrote:

Maybe (all?) the clarification the docs need is to say:
"Partitioned tables are not *themselves* processed by autovacuum."

Yes, I think the lack of autovacuum needs to be specifically mentioned
since most people assume autovacuum handles _all_ statistics updating.

That's what 61fa6ca79 aimed to do. Laurenz is suggesting further
clarification.

Can someone summarize how bad it is we have no statistics on partitioned
tables?ï¿½ It sounds bad to me.

Andrey Lepikhov had an example earlier in this thread[1]. It doesn't take
an exotic query.

Attached is a new version of my patch that tries to improve the wording.

I tweaked this a bit to end up with:

-    Partitioned tables are not processed by autovacuum.  Statistics
-    should be collected by running a manual <command>ANALYZE</command> when it is
+    The leaf partitions of a partitioned table are normal tables and are processed
+    by autovacuum; however, autovacuum does not process the partitioned table itself.
+    This is no problem as far as <command>VACUUM</command> is concerned, since
+    there's no need to vacuum the empty, partitioned table.  But, as mentioned in
+    <xref linkend="vacuum-for-statistics"/>, it also means that autovacuum won't
+    run <command>ANALYZE</command> on the partitioned table.
+    Although statistics are automatically gathered on its leaf partitions, some queries also need
+    statistics on the partitioned table to run optimally.  You should collect statistics by
+    running a manual <command>ANALYZE</command> when the partitioned table is
first populated, and again whenever the distribution of data in its
partitions changes significantly.
</para>

"partitions are normal tables" was techically wrong, as partitions can
also be partitioned.

--
Justin

#19

bruce@momjian.us

over 3 years ago

In reply to: Justin Pryzby (#18)

Re: document the need to analyze partitioned tables

On Wed, Jan 18, 2023 at 11:49:19AM -0600, Justin Pryzby wrote:

On Wed, Jan 18, 2023 at 10:15:18AM +0100, Laurenz Albe wrote:

On Tue, 2023-01-17 at 16:16 -0500, Bruce Momjian wrote:

On Tue, Jan 17, 2023 at 03:00:50PM -0600, Justin Pryzby wrote:

Maybe (all?) the clarification the docs need is to say:
"Partitioned tables are not *themselves* processed by autovacuum."

Yes, I think the lack of autovacuum needs to be specifically mentioned
since most people assume autovacuum handles _all_ statistics updating.

That's what 61fa6ca79 aimed to do. Laurenz is suggesting further
clarification.

Ah, makes sense, thanks.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Embrace your flaws. They make you human, rather than perfect,
which you will never be.

#20

laurenz.albe@cybertec.at

over 3 years ago

In reply to: Justin Pryzby (#18)

Re: document the need to analyze partitioned tables

On Wed, 2023-01-18 at 11:49 -0600, Justin Pryzby wrote:

I tweaked this a bit to end up with:

-    Partitioned tables are not processed by autovacuum.  Statistics
-    should be collected by running a manual <command>ANALYZE</command> when it is
+    The leaf partitions of a partitioned table are normal tables and are processed
+    by autovacuum; however, autovacuum does not process the partitioned table itself.
+    This is no problem as far as <command>VACUUM</command> is concerned, since
+    there's no need to vacuum the empty, partitioned table.  But, as mentioned in
+    <xref linkend="vacuum-for-statistics"/>, it also means that autovacuum won't
+    run <command>ANALYZE</command> on the partitioned table.
+    Although statistics are automatically gathered on its leaf partitions, some queries also need
+    statistics on the partitioned table to run optimally.  You should collect statistics by
+    running a manual <command>ANALYZE</command> when the partitioned table is
     first populated, and again whenever the distribution of data in its
     partitions changes significantly.
    </para>

"partitions are normal tables" was techically wrong, as partitions can
also be partitioned.

I am fine with your tweaks. I think this is good to go.

Yours,
Laurenz Albe

#21

bruce@momjian.us

over 3 years ago

In reply to: Laurenz Albe (#17)

#22