Inconsistent use of relpages = -1
As Corey discovered, and I re-discovered later, partitioned tables can
have relpages=-1:
/messages/by-id/f4a0cf7975f1ad42a20fcc91be9e938a4f71259d.camel@j-davis.com
One problem is that the code (analyze.c:680) is a bit unclear because
it implicitly casts back and forth between signed and unsigned. If we
really mean InvalidBlockNumber, then we should use that instead of -1,
and then explicitly cast to an integer for storage in the catalog.
Another problem is that it's inconsistent: sometimes it's 0 (before
analyze) and sometimes -1. Views also don't have any storage, but
relpages are always 0.
And lastly, it's undocumented: if -1 is allowable, it should be in the
public docs for pg_class.
I don't see any obvious reason that -1 is better than 0, or any code
that checks for it, so I'm inclined to just use zero instead.
Thoughts?
Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
As Corey discovered, and I re-discovered later, partitioned tables can
have relpages=-1:
...
Another problem is that it's inconsistent: sometimes it's 0 (before
analyze) and sometimes -1. Views also don't have any storage, but
relpages are always 0.
...
I don't see any obvious reason that -1 is better than 0, or any code
that checks for it, so I'm inclined to just use zero instead.
Hmm. relpages = 0 should mean that the relation is (thought to be)
of size zero, which is certainly true for the partitioned table itself,
except that that reasoning should also mean that reltuples is zero.
But we don't do that:
regression=# select relname, relkind, relpages,reltuples from pg_class where relpages < 0;
relname | relkind | relpages | reltuples
----------------+---------+----------+-----------
past_parted | p | -1 | 0
prt1 | p | -1 | 300
pagg_tab | p | -1 | 3000
prt2 | p | -1 | 200
pagg_tab1 | p | -1 | 150
pagg_tab2 | p | -1 | 100
prt1_e | p | -1 | 300
prt2_e | p | -1 | 200
pagg_tab_m | p | -1 | 3000
pagg_tab_ml | p | -1 | 30000
prt1_m | p | -1 | 300
prt2_m | p | -1 | 200
pagg_tab_ml_p2 | p | -1 | 8000
pagg_tab_ml_p3 | p | -1 | 10000
pagg_tab_para | p | -1 | 30000
plt1 | p | -1 | 300
plt2 | p | -1 | 200
plt1_e | p | -1 | 300
pht1 | p | -1 | 300
pht2 | p | -1 | 200
pht1_e | p | -1 | 150
prt1_l | p | -1 | 300
prt1_l_p2 | p | -1 | 125
prt1_l_p3 | p | -1 | 50
prt2_l | p | -1 | 200
prt2_l_p2 | p | -1 | 83
prt2_l_p3 | p | -1 | 33
prt1_n | p | -1 | 250
prt2_n | p | -1 | 300
prt3_n | p | -1 | 0
prt4_n | p | -1 | 300
alpha | p | -1 | 360
alpha_neg | p | -1 | 180
alpha_pos | p | -1 | 180
beta | p | -1 | 360
beta_neg | p | -1 | 180
beta_pos | p | -1 | 180
(37 rows)
If we are going to put data into reltuples but not relpages,
I think I agree with setting relpages to -1 to signify
"unknown" (analogously to -1 for reltuples). Otherwise it
looks like the table has infinite tuple density, which is
likely to bollix something somewhere.
I agree that not documenting this usage is bad, and not being
consistent about it is worse. But I don't think switching
over to the previously-minority case will improve matters.
regards, tom lane
On Fri, 2024-10-18 at 15:14 -0400, Tom Lane wrote:
If we are going to put data into reltuples but not relpages,
I think I agree with setting relpages to -1 to signify
"unknown" (analogously to -1 for reltuples). Otherwise it
looks like the table has infinite tuple density, which is
likely to bollix something somewhere.
That's a good point.
I attached a patch that creates partitioned tables with relpages=-1,
and updates the docs.
It's awkward to cast back and forth between BlockNumber and int32, so I
updated the signature of vac_update_relstats() as well.
Regards,
Jeff Davis
Attachments:
v1-0001-Be-more-consistent-about-relpages-for-partitioned.patchtext/x-patch; charset=UTF-8; name=v1-0001-Be-more-consistent-about-relpages-for-partitioned.patchDownload
From 34241c8a9e18c0e945dd38a2858676d52facc898 Mon Sep 17 00:00:00 2001
From: Jeff Davis <jeff@j-davis.com>
Date: Tue, 22 Oct 2024 10:04:11 -0700
Subject: [PATCH v1] Be more consistent about relpages for partitioned tables.
Use -1 for relpages, and document it.
Discussion: https://postgr.es/m/924396.1729278841@sss.pgh.pa.us
---
doc/src/sgml/catalogs.sgml | 3 ++-
src/backend/catalog/heap.c | 5 ++++-
src/backend/commands/vacuum.c | 9 +++++----
src/include/commands/vacuum.h | 2 +-
4 files changed, 12 insertions(+), 7 deletions(-)
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 964c819a02..19b7bbd4a8 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -2033,7 +2033,8 @@ SCRAM-SHA-256$<replaceable><iteration count></replaceable>:<replaceable>&l
<symbol>BLCKSZ</symbol>). This is only an estimate used by the
planner. It is updated by <link linkend="sql-vacuum"><command>VACUUM</command></link>,
<link linkend="sql-analyze"><command>ANALYZE</command></link>, and a few DDL commands such as
- <link linkend="sql-createindex"><command>CREATE INDEX</command></link>.
+ <link linkend="sql-createindex"><command>CREATE INDEX</command></link>. For partitioned
+ tables, <structfield>relpages</structfield> is <literal>-1</literal>.
</para></entry>
</row>
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index 0078a12f26..d21bf0b7da 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -987,7 +987,10 @@ AddNewRelationTuple(Relation pg_class_desc,
new_rel_reltup = new_rel_desc->rd_rel;
/* The relation is empty */
- new_rel_reltup->relpages = 0;
+ if (relkind == RELKIND_PARTITIONED_TABLE)
+ new_rel_reltup->relpages = -1;
+ else
+ new_rel_reltup->relpages = 0;
new_rel_reltup->reltuples = -1;
new_rel_reltup->relallvisible = 0;
diff --git a/src/backend/commands/vacuum.c b/src/backend/commands/vacuum.c
index ac8f5d9c25..eff65b7714 100644
--- a/src/backend/commands/vacuum.c
+++ b/src/backend/commands/vacuum.c
@@ -1402,13 +1402,14 @@ vac_estimate_reltuples(Relation relation,
* always allowable.
*
* Note: num_tuples should count only *live* tuples, since
- * pg_class.reltuples is defined that way.
+ * pg_class.reltuples is defined that way. Also, num_pages is signed,
+ * because partitioned tables have relpages of -1.
*
* This routine is shared by VACUUM and ANALYZE.
*/
void
vac_update_relstats(Relation relation,
- BlockNumber num_pages, double num_tuples,
+ int32 num_pages, double num_tuples,
BlockNumber num_all_visible_pages,
bool hasindex, TransactionId frozenxid,
MultiXactId minmulti,
@@ -1444,9 +1445,9 @@ vac_update_relstats(Relation relation,
/* Apply statistical updates, if any, to copied tuple */
dirty = false;
- if (pgcform->relpages != (int32) num_pages)
+ if (pgcform->relpages != num_pages)
{
- pgcform->relpages = (int32) num_pages;
+ pgcform->relpages = num_pages;
dirty = true;
}
if (pgcform->reltuples != (float4) num_tuples)
diff --git a/src/include/commands/vacuum.h b/src/include/commands/vacuum.h
index 759f9a87d3..ee3522ef8b 100644
--- a/src/include/commands/vacuum.h
+++ b/src/include/commands/vacuum.h
@@ -326,7 +326,7 @@ extern double vac_estimate_reltuples(Relation relation,
BlockNumber scanned_pages,
double scanned_tuples);
extern void vac_update_relstats(Relation relation,
- BlockNumber num_pages,
+ int32 num_pages,
double num_tuples,
BlockNumber num_all_visible_pages,
bool hasindex,
--
2.34.1
On Tue, 2024-10-22 at 10:41 -0700, Jeff Davis wrote:
I attached a patch that creates partitioned tables with relpages=-1,
and updates the docs.
Does this need any changes for pg_upgrade?
Yours,
Laurenz Albe
On Wed, 2024-10-23 at 04:47 +0200, Laurenz Albe wrote:
On Tue, 2024-10-22 at 10:41 -0700, Jeff Davis wrote:
I attached a patch that creates partitioned tables with relpages=-
1,
and updates the docs.Does this need any changes for pg_upgrade?
pg_upgrade would go through the same DDL path, so I expect it would be
set to -1 in the upgraded cluster anyway. Are you seeing something
different?
In any case, the change is just meant to improve consistency and
documentation. I didn't intend to create a hard guarantee that it would
always be -1, so perhaps I should make the documentation more vague
with "may be" instead of "is"?
It bothers me somewhat that views still have relpages=0, because they
also don't have storage. Thoughts?
Regards,
Jeff Davis
Jeff Davis <pgsql@j-davis.com> writes:
It bothers me somewhat that views still have relpages=0, because they
also don't have storage. Thoughts?
I think it's fine, because they also have reltuples=0. The
combination of those two zeroes indicates ignorance as to
the rel's contents. If either field is not zero, though, then
the combination ought to be sensible.
regards, tom lane
On Wed, 2024-10-23 at 10:05 -0700, Jeff Davis wrote:
On Wed, 2024-10-23 at 04:47 +0200, Laurenz Albe wrote:
On Tue, 2024-10-22 at 10:41 -0700, Jeff Davis wrote:
I attached a patch that creates partitioned tables with relpages=-
1,
and updates the docs.Does this need any changes for pg_upgrade?
pg_upgrade would go through the same DDL path, so I expect it would be
set to -1 in the upgraded cluster anyway. Are you seeing something
different?
No; I was too lazy to check, so I asked.
In any case, the change is just meant to improve consistency and
documentation. I didn't intend to create a hard guarantee that it would
always be -1, so perhaps I should make the documentation more vague
with "may be" instead of "is"?
Reading the thread an Tom's comments again, I get the impression that there
are two states that we consider OK:
- "relpages" and "reltuples" are both 0
- "relpages" is -1 and "reltuples" is greater than 0
What you write above indicates that "relpages" = 0 and "reltuples" > 0
would also be acceptable.
As long as the code does not mistakenly assume that a partitioned table
is empty, and as long as the documentation is not confusing, anything
is fine with me. Currently, I am still a bit confused.
Yours,
Laurenz Albe
On Thu, 2024-10-24 at 05:01 +0300, Laurenz Albe wrote:
What you write above indicates that "relpages" = 0 and "reltuples" >
0
would also be acceptable.
As Tom pointed out, that creates a risk that it's interpreted as
infinite tuple denisity.
The only functional change in my patch is to create the partitioned
table with relpages=-1 to be more consistent with the value after it's
analyzed.
Regards,
Jeff Davis
On Thu, 2024-10-24 at 08:03 -0700, Jeff Davis wrote:
On Thu, 2024-10-24 at 05:01 +0300, Laurenz Albe wrote:
What you write above indicates that "relpages" = 0 and "reltuples" >
0
would also be acceptable.As Tom pointed out, that creates a risk that it's interpreted as
infinite tuple denisity.The only functional change in my patch is to create the partitioned
table with relpages=-1 to be more consistent with the value after it's
analyzed.
Great, then it cannot be wrong. Sorry for the noise!
Yours,
Laurenz Albe