Don't treat virtual generated columns as missing statistics in vacuumdb --missing-stats-only
Hi,
I found that "vacuumdb --missing-stats-only" always performs ANALYZE
on tables with a virtual generated column, since such columns currently
never have statistics. This seems like an obvious waste, so I've attached
a patch to fix it, ensuring that virtual generated columns are not
regarded as missing statistics.
Regards,
Yugo Nagata
--
Yugo Nagata <nagata@sraoss.co.jp>
Attachments:
0001-Don-t-treat-virtual-generated-columns-as-missing-sta.patchtext/x-diff; name=0001-Don-t-treat-virtual-generated-columns-as-missing-sta.patchDownload
From 21d05a07657969032a1ab3c3b8f84edfc8551b86 Mon Sep 17 00:00:00 2001
From: Yugo Nagata <nagata@sraoss.co.jp>
Date: Wed, 20 Aug 2025 10:41:49 +0900
Subject: [PATCH] Don't treat virtual generated columns as missing statistics
in vacuumdb --missing-stats-only
---
src/bin/scripts/vacuumdb.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 79b1096eb08..71a58efff30 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -14,6 +14,7 @@
#include <limits.h>
+#include "catalog/pg_attribute_d.h"
#include "catalog/pg_class_d.h"
#include "common.h"
#include "common/connect.h"
@@ -957,6 +958,8 @@ retrieve_objects(PGconn *conn, vacuumingOptions *vacopts,
" AND a.attnum OPERATOR(pg_catalog.>) 0::pg_catalog.int2\n"
" AND NOT a.attisdropped\n"
" AND a.attstattarget IS DISTINCT FROM 0::pg_catalog.int2\n"
+ " AND a.attgenerated OPERATOR(pg_catalog.<>) "
+ CppAsString2(ATTRIBUTE_GENERATED_VIRTUAL) "\n"
" AND NOT EXISTS (SELECT NULL FROM pg_catalog.pg_statistic s\n"
" WHERE s.starelid OPERATOR(pg_catalog.=) a.attrelid\n"
" AND s.staattnum OPERATOR(pg_catalog.=) a.attnum\n"
--
2.43.0
On Wed, Aug 20, 2025 at 10:42 AM Yugo Nagata <nagata@sraoss.co.jp> wrote:
Hi,
I found that "vacuumdb --missing-stats-only" always performs ANALYZE
on tables with a virtual generated column, since such columns currently
never have statistics. This seems like an obvious waste, so I've attached
a patch to fix it, ensuring that virtual generated columns are not
regarded as missing statistics.
Thanks for the report and patch! This seems to be an oversight from
the commit that added virtual generated columns.
For the patch, shouldn't we also add a regression test for --missing-stats-only
with generated columns, to prevent this issue from happening again?
Regards,
--
Fujii Masao
On Wed, 20 Aug 2025 12:49:14 +0900
Fujii Masao <masao.fujii@gmail.com> wrote:
On Wed, Aug 20, 2025 at 10:42 AM Yugo Nagata <nagata@sraoss.co.jp> wrote:
Hi,
I found that "vacuumdb --missing-stats-only" always performs ANALYZE
on tables with a virtual generated column, since such columns currently
never have statistics. This seems like an obvious waste, so I've attached
a patch to fix it, ensuring that virtual generated columns are not
regarded as missing statistics.Thanks for the report and patch! This seems to be an oversight from
the commit that added virtual generated columns.For the patch, shouldn't we also add a regression test for --missing-stats-only
with generated columns, to prevent this issue from happening again?
Thank you for reviewing the patch and your suggestion.
I agree that we should add a test, since the behavior may change in the future
when statistics begin to be collected for virtual generated columns, and the test
will serve as a reminder when this behavior changes.
I've attached a updated patch including the test.
Regards,
Yugo Nagata
--
Yugo Nagata <nagata@sraoss.co.jp>
Attachments:
v2-0001-Don-t-treat-virtual-generated-columns-as-missing-.patchtext/x-diff; name=v2-0001-Don-t-treat-virtual-generated-columns-as-missing-.patchDownload
From 8a4bc5cd58c16c6a8950d17ccd63b276cdd2c9b9 Mon Sep 17 00:00:00 2001
From: Yugo Nagata <nagata@sraoss.co.jp>
Date: Wed, 20 Aug 2025 10:41:49 +0900
Subject: [PATCH v2] Don't treat virtual generated columns as missing
statistics in vacuumdb --missing-stats-only
---
src/bin/scripts/t/100_vacuumdb.pl | 12 ++++++++++++
src/bin/scripts/vacuumdb.c | 3 +++
2 files changed, 15 insertions(+)
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index ff56a13b46b..c81f9c6b490 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -340,4 +340,16 @@ $node->issues_sql_unlike(
qr/statement:\ ANALYZE/sx,
'--missing-stats-only with no missing partition stats');
+$node->safe_psql('postgres',
+ 'ALTER TABLE regression_vacuumdb_test ADD COLUMN c int GENERATED ALWAYS AS (a+b)'
+);
+$node->issues_sql_unlike(
+ [
+ 'vacuumdb', '--analyze-only',
+ '--missing-stats-only', '-t',
+ 'regression_vacuumdb_parted', 'postgres'
+ ],
+ qr/statement:\ ANALYZE/sx,
+ '--missing-stats-only with virtual generated column');
+
done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 79b1096eb08..71a58efff30 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -14,6 +14,7 @@
#include <limits.h>
+#include "catalog/pg_attribute_d.h"
#include "catalog/pg_class_d.h"
#include "common.h"
#include "common/connect.h"
@@ -957,6 +958,8 @@ retrieve_objects(PGconn *conn, vacuumingOptions *vacopts,
" AND a.attnum OPERATOR(pg_catalog.>) 0::pg_catalog.int2\n"
" AND NOT a.attisdropped\n"
" AND a.attstattarget IS DISTINCT FROM 0::pg_catalog.int2\n"
+ " AND a.attgenerated OPERATOR(pg_catalog.<>) "
+ CppAsString2(ATTRIBUTE_GENERATED_VIRTUAL) "\n"
" AND NOT EXISTS (SELECT NULL FROM pg_catalog.pg_statistic s\n"
" WHERE s.starelid OPERATOR(pg_catalog.=) a.attrelid\n"
" AND s.staattnum OPERATOR(pg_catalog.=) a.attnum\n"
--
2.43.0
On Wed, 20 Aug 2025 13:30:12 +0900
Yugo Nagata <nagata@sraoss.co.jp> wrote:
On Wed, 20 Aug 2025 12:49:14 +0900
Fujii Masao <masao.fujii@gmail.com> wrote:On Wed, Aug 20, 2025 at 10:42 AM Yugo Nagata <nagata@sraoss.co.jp> wrote:
Hi,
I found that "vacuumdb --missing-stats-only" always performs ANALYZE
on tables with a virtual generated column, since such columns currently
never have statistics. This seems like an obvious waste, so I've attached
a patch to fix it, ensuring that virtual generated columns are not
regarded as missing statistics.Thanks for the report and patch! This seems to be an oversight from
the commit that added virtual generated columns.For the patch, shouldn't we also add a regression test for --missing-stats-only
with generated columns, to prevent this issue from happening again?Thank you for reviewing the patch and your suggestion.
I agree that we should add a test, since the behavior may change in the future
when statistics begin to be collected for virtual generated columns, and the test
will serve as a reminder when this behavior changes.I've attached a updated patch including the test.
The patch conflicted with the latest commit, so I rebased it.
Regards,
Yugo Nagata
--
Yugo Nagata <nagata@sraoss.co.jp>
Attachments:
v3-0001-Avoid-treating-virtual-generated-columns-as-missi.patchtext/x-diff; name=v3-0001-Avoid-treating-virtual-generated-columns-as-missi.patchDownload
From 9355bca773f306c8012cb40c52a5ef769d142e7e Mon Sep 17 00:00:00 2001
From: Yugo Nagata <nagata@sraoss.co.jp>
Date: Wed, 20 Aug 2025 10:41:49 +0900
Subject: [PATCH v3] Avoid treating virtual generated columns as missing
statistics in vacuumdb --missing-stats-only
Previously, vacuumdb --missing-stats-only always ran ANALYZE on tables
with virtual generated columns, since such columns never have statistics.
This was unnecessary, so virtual generated columns are no longer regarded
as missing statistics.
Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/20250820104226.8ba51e43164cd590b863ce41%40sraoss.co.jp
---
src/bin/scripts/t/100_vacuumdb.pl | 12 ++++++++++++
src/bin/scripts/vacuumdb.c | 3 +++
2 files changed, 15 insertions(+)
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 240f0fdd3e5..c9ed7463008 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -351,4 +351,16 @@ $node->issues_sql_like(
qr/statement: ANALYZE public.parent_table/s,
'--analyze-only updates statistics for partitioned tables');
+$node->safe_psql('postgres',
+ 'ALTER TABLE regression_vacuumdb_test ADD COLUMN c int GENERATED ALWAYS AS (a+b)'
+);
+$node->issues_sql_unlike(
+ [
+ 'vacuumdb', '--analyze-only',
+ '--missing-stats-only', '-t',
+ 'regression_vacuumdb_parted', 'postgres'
+ ],
+ qr/statement:\ ANALYZE/sx,
+ '--missing-stats-only with virtual generated column');
+
done_testing();
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 22093e50aa5..5a8ec959b06 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -14,6 +14,7 @@
#include <limits.h>
+#include "catalog/pg_attribute_d.h"
#include "catalog/pg_class_d.h"
#include "common.h"
#include "common/connect.h"
@@ -973,6 +974,8 @@ retrieve_objects(PGconn *conn, vacuumingOptions *vacopts,
" AND a.attnum OPERATOR(pg_catalog.>) 0::pg_catalog.int2\n"
" AND NOT a.attisdropped\n"
" AND a.attstattarget IS DISTINCT FROM 0::pg_catalog.int2\n"
+ " AND a.attgenerated OPERATOR(pg_catalog.<>) "
+ CppAsString2(ATTRIBUTE_GENERATED_VIRTUAL) "\n"
" AND NOT EXISTS (SELECT NULL FROM pg_catalog.pg_statistic s\n"
" WHERE s.starelid OPERATOR(pg_catalog.=) a.attrelid\n"
" AND s.staattnum OPERATOR(pg_catalog.=) a.attnum\n"
--
2.43.0
On Wed, Aug 20, 2025 at 01:53:38PM +0900, Yugo Nagata wrote:
The patch conflicted with the latest commit, so I rebased it.
Nice find. I would suggest adding the virtual generated column to
regression_vacuumdb_test when it is first created so that we can just rely
on the existing test cases. In fact, by doing so, you'll see that we need
a similar change to the "inheritance and regular stats" part of the query.
--
nathan
On Thu, Aug 21, 2025 at 4:19 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
On Wed, Aug 20, 2025 at 01:53:38PM +0900, Yugo Nagata wrote:
The patch conflicted with the latest commit, so I rebased it.
Nice find. I would suggest adding the virtual generated column to
regression_vacuumdb_test when it is first created so that we can just rely
on the existing test cases. In fact, by doing so, you'll see that we need
a similar change to the "inheritance and regular stats" part of the query.
+1
Regards,
--
Fujii Masao
On Thu, Aug 21, 2025 at 11:13:44AM +0900, Fujii Masao wrote:
On Thu, Aug 21, 2025 at 4:19 AM Nathan Bossart <nathandbossart@gmail.com> wrote:
Nice find. I would suggest adding the virtual generated column to
regression_vacuumdb_test when it is first created so that we can just rely
on the existing test cases. In fact, by doing so, you'll see that we need
a similar change to the "inheritance and regular stats" part of the query.+1
Since we're running out of time for v18, I went ahead and updated the
patch. I've also added an open item for this.
--
nathan
Attachments:
v4-0001-vacuumdb-Fix-missing-stats-only-with-virtual-gene.patchtext/plain; charset=us-asciiDownload
From d6edeada7f0670eb18717ee7df5841bd80c927b4 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Thu, 21 Aug 2025 08:45:50 -0500
Subject: [PATCH v4 1/1] vacuumdb: Fix --missing-stats-only with virtual
generated columns.
Author: Yugo Nagata <nagata@sraoss.co.jp>
Reviewed-by: Fujii Masao <masao.fujii@gmail.com>
Discussion: https://postgr.es/m/20250820104226.8ba51e43164cd590b863ce41%40sraoss.co.jp
---
src/bin/scripts/t/100_vacuumdb.pl | 7 ++++---
src/bin/scripts/vacuumdb.c | 5 +++++
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/src/bin/scripts/t/100_vacuumdb.pl b/src/bin/scripts/t/100_vacuumdb.pl
index 240f0fdd3e5..945c30df156 100644
--- a/src/bin/scripts/t/100_vacuumdb.pl
+++ b/src/bin/scripts/t/100_vacuumdb.pl
@@ -237,9 +237,10 @@ $node->command_fails_like(
qr/cannot vacuum all databases and a specific one at the same time/,
'cannot use option --all and a dbname as argument at the same time');
-$node->safe_psql('postgres',
- 'CREATE TABLE regression_vacuumdb_test AS select generate_series(1, 10) a, generate_series(2, 11) b;'
-);
+$node->safe_psql('postgres', q|
+ CREATE TABLE regression_vacuumdb_test AS select generate_series(1, 10) a, generate_series(2, 11) b;
+ ALTER TABLE regression_vacuumdb_test ADD COLUMN c INT GENERATED ALWAYS AS (a + b);
+|);
$node->issues_sql_like(
[
'vacuumdb', '--analyze-only',
diff --git a/src/bin/scripts/vacuumdb.c b/src/bin/scripts/vacuumdb.c
index 22093e50aa5..fd236087e90 100644
--- a/src/bin/scripts/vacuumdb.c
+++ b/src/bin/scripts/vacuumdb.c
@@ -14,6 +14,7 @@
#include <limits.h>
+#include "catalog/pg_attribute_d.h"
#include "catalog/pg_class_d.h"
#include "common.h"
#include "common/connect.h"
@@ -973,6 +974,8 @@ retrieve_objects(PGconn *conn, vacuumingOptions *vacopts,
" AND a.attnum OPERATOR(pg_catalog.>) 0::pg_catalog.int2\n"
" AND NOT a.attisdropped\n"
" AND a.attstattarget IS DISTINCT FROM 0::pg_catalog.int2\n"
+ " AND a.attgenerated OPERATOR(pg_catalog.<>) "
+ CppAsString2(ATTRIBUTE_GENERATED_VIRTUAL) "\n"
" AND NOT EXISTS (SELECT NULL FROM pg_catalog.pg_statistic s\n"
" WHERE s.starelid OPERATOR(pg_catalog.=) a.attrelid\n"
" AND s.staattnum OPERATOR(pg_catalog.=) a.attnum\n"
@@ -1010,6 +1013,8 @@ retrieve_objects(PGconn *conn, vacuumingOptions *vacopts,
" AND a.attnum OPERATOR(pg_catalog.>) 0::pg_catalog.int2\n"
" AND NOT a.attisdropped\n"
" AND a.attstattarget IS DISTINCT FROM 0::pg_catalog.int2\n"
+ " AND a.attgenerated OPERATOR(pg_catalog.<>) "
+ CppAsString2(ATTRIBUTE_GENERATED_VIRTUAL) "\n"
" AND c.relhassubclass\n"
" AND NOT p.inherited\n"
" AND EXISTS (SELECT NULL FROM pg_catalog.pg_inherits h\n"
--
2.39.5 (Apple Git-154)
On Thu, Aug 21, 2025 at 9:56 AM Nathan Bossart <nathandbossart@gmail.com>
wrote:
On Thu, Aug 21, 2025 at 11:13:44AM +0900, Fujii Masao wrote:
On Thu, Aug 21, 2025 at 4:19 AM Nathan Bossart <nathandbossart@gmail.com>
wrote:
Nice find. I would suggest adding the virtual generated column to
regression_vacuumdb_test when it is first created so that we can justrely
on the existing test cases. In fact, by doing so, you'll see that we
need
a similar change to the "inheritance and regular stats" part of the
query.
+1
Since we're running out of time for v18, I went ahead and updated the
patch. I've also added an open item for this.
+1 on the fix, works for me.
On Thu, Aug 21, 2025 at 11:29:56AM -0400, Corey Huinker wrote:
On Thu, Aug 21, 2025 at 9:56 AM Nathan Bossart <nathandbossart@gmail.com>
wrote:Since we're running out of time for v18, I went ahead and updated the
patch. I've also added an open item for this.+1 on the fix, works for me.
Thanks. I'll plan on committing this in roughly 24 hours (barring
additional feedback).
--
nathan
On Thu, 21 Aug 2025 10:37:10 -0500
Nathan Bossart <nathandbossart@gmail.com> wrote:
On Thu, Aug 21, 2025 at 11:29:56AM -0400, Corey Huinker wrote:
On Thu, Aug 21, 2025 at 9:56 AM Nathan Bossart <nathandbossart@gmail.com>
wrote:Since we're running out of time for v18, I went ahead and updated the
patch. I've also added an open item for this.+1 on the fix, works for me.
Thanks. I'll plan on committing this in roughly 24 hours (barring
additional feedback).
Thank you for updating the patch.
I'm fine with your fixes, both in the test and in the additional change
for inheritance tables. I had overlooked the latter one.
Regards,
Yugo Nagata
--
Yugo Nagata <nagata@sraoss.co.jp>
On Fri, Aug 22, 2025 at 2:23 AM Yugo Nagata <nagata@sraoss.co.jp> wrote:
On Thu, 21 Aug 2025 10:37:10 -0500
Nathan Bossart <nathandbossart@gmail.com> wrote:On Thu, Aug 21, 2025 at 11:29:56AM -0400, Corey Huinker wrote:
On Thu, Aug 21, 2025 at 9:56 AM Nathan Bossart <nathandbossart@gmail.com>
wrote:Since we're running out of time for v18, I went ahead and updated the
patch. I've also added an open item for this.+1 on the fix, works for me.
Thanks. I'll plan on committing this in roughly 24 hours (barring
additional feedback).
Thanks a lot!
Thank you for updating the patch.
I'm fine with your fixes, both in the test and in the additional change
for inheritance tables. I had overlooked the latter one.
The patch looks good to me, too.
At first, I just wondered whether vacuumdb --missing-stats-only would work
on older servers, since the patch adds access to the attgenerated column,
which might not exist there. But that's not an issue, because
--missing-stats-only is supported only on servers version 15 or later,
where attgenerated is available.
Regards,
--
Fujii Masao
On Fri, Aug 22, 2025 at 10:19:32AM +0900, Fujii Masao wrote:
At first, I just wondered whether vacuumdb --missing-stats-only would work
on older servers, since the patch adds access to the attgenerated column,
which might not exist there. But that's not an issue, because
--missing-stats-only is supported only on servers version 15 or later,
where attgenerated is available.
Yeah, this crossed my mind, too, and I reached the same conclusion.
--
nathan