Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

Started by Ayush Vatsaover 1 year ago29 messages
#1Ayush Vatsa
ayushvatsa1810@gmail.com

Hi PostgreSQL Community,
I have encountered an issue when attempting to use pgstattuple extension
with sequences. When executing the following command:

SELECT * FROM pgstattuple('serial');
ERROR: only heap AM is supported

This behaviour is observed in PostgreSQL versions post v11 [1]* Reference to Earlier Discussion:* For additional context, I previously discussed this issue on the pgsql-general mailing list. You can find the discussion /messages/by-id/CACX+KaMOd3HHteOJNX7fkWxO+R=uLJkfKqE2-QUK8fKmKfOwqw@mail.gmail.com. In that thread, it was suggested that this could be considered a documentation bug, and that we might update the documentation and regression tests accordingly. , where
sequences support in pgstattuple used to work fine. However, this issue
slipped through as we did not have any test cases to catch it.

Given the situation, I see two potential paths forward:
*1/ Reintroduce Support for Sequences in pgstattuple*: This would be a
relatively small change. However, it's important to note that the purpose
of pgstattuple is to provide statistics like the number of tuples, dead
tuples, and free space in a relation. Sequences, on the other hand, return
only one value at a time and don’t have attributes like dead tuples.
Therefore, the result for any sequence would consistently look something
like this:

SELECT * FROM pgstattuple('serial');
table_len | tuple_count | tuple_len | tuple_percent |
dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space |
free_percent
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
8192 | 1 | 41 | 0.5 |
0 | 0 | 0 | 8104 | 98.93
(1 row)

*2/ Explicitly Block Sequence Support in pgstattuple*: We could align
sequences with other unsupported objects, such as foreign tables, by
providing a more explicit error message. For instance:

SELECT * FROM pgstattuple('x');
ERROR: cannot get tuple-level statistics for relation "x"
DETAIL: This operation is not supported for foreign tables.

This approach would ensure that the error handling for sequences is
consistent with how other unsupported objects are handled.
Personally, I lean towards the second approach, as it promotes consistency
and clarity. However, I would greatly appreciate the community's feedback
and suggestions on the best way to proceed.
Based on the feedback received, I will work on the appropriate patch.

Looking forward to your comments and feedback.

[1]: * Reference to Earlier Discussion:* For additional context, I previously discussed this issue on the pgsql-general mailing list. You can find the discussion /messages/by-id/CACX+KaMOd3HHteOJNX7fkWxO+R=uLJkfKqE2-QUK8fKmKfOwqw@mail.gmail.com. In that thread, it was suggested that this could be considered a documentation bug, and that we might update the documentation and regression tests accordingly.
discussed this issue on the pgsql-general mailing list. You can find the
discussion
/messages/by-id/CACX+KaMOd3HHteOJNX7fkWxO+R=uLJkfKqE2-QUK8fKmKfOwqw@mail.gmail.com.
In that thread, it was suggested that this could be considered a
documentation bug, and that we might update the documentation and
regression tests accordingly.

Regards
Ayush Vatsa
AWS

#2Robert Haas
robertmhaas@gmail.com
In reply to: Ayush Vatsa (#1)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Mon, Aug 26, 2024 at 11:44 AM Ayush Vatsa <ayushvatsa1810@gmail.com> wrote:

Hi PostgreSQL Community,
I have encountered an issue when attempting to use pgstattuple extension with sequences. When executing the following command:

SELECT * FROM pgstattuple('serial');
ERROR: only heap AM is supported

This behaviour is observed in PostgreSQL versions post v11 [1] , where sequences support in pgstattuple used to work fine. However, this issue slipped through as we did not have any test cases to catch it.

Given the situation, I see two potential paths forward:
1/ Reintroduce Support for Sequences in pgstattuple: This would be a relatively small change. However, it's important to note that the purpose of pgstattuple is to provide statistics like the number of tuples, dead tuples, and free space in a relation. Sequences, on the other hand, return only one value at a time and don’t have attributes like dead tuples. Therefore, the result for any sequence would consistently look something like this:

SELECT * FROM pgstattuple('serial');
table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent
-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
8192 | 1 | 41 | 0.5 | 0 | 0 | 0 | 8104 | 98.93
(1 row)

2/ Explicitly Block Sequence Support in pgstattuple: We could align sequences with other unsupported objects, such as foreign tables, by providing a more explicit error message. For instance:

SELECT * FROM pgstattuple('x');
ERROR: cannot get tuple-level statistics for relation "x"
DETAIL: This operation is not supported for foreign tables.

This approach would ensure that the error handling for sequences is consistent with how other unsupported objects are handled.
Personally, I lean towards the second approach, as it promotes consistency and clarity. However, I would greatly appreciate the community's feedback and suggestions on the best way to proceed.
Based on the feedback received, I will work on the appropriate patch.

Looking forward to your comments and feedback.

I don't really see what the problem is here. You state that the
information pgstattuple provides isn't really useful for sequences, so
that means there's no real reason to do (1). As for (2), I'm not
opposed to improving error messages but it's not clear to me why you
think that the current one is bad. You say that we should provide a
more explicit error message, but "only heap AM is supported" seems
pretty explicit to me: it doesn't spell out that this only works for
relkind='r', but since relam=heap is only possible for relkind='r',
there's not really any other reasonable interpretation, which IMHO
makes this pretty specific about what the problem is. Maybe you just
find it confusing, but that's a bit different from whether it's
explicit enough.

--
Robert Haas
EDB: http://www.enterprisedb.com

#3Nathan Bossart
nathandbossart@gmail.com
In reply to: Ayush Vatsa (#1)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Mon, Aug 26, 2024 at 09:14:27PM +0530, Ayush Vatsa wrote:

Given the situation, I see two potential paths forward:
*1/ Reintroduce Support for Sequences in pgstattuple*: This would be a
relatively small change. However, it's important to note that the purpose
of pgstattuple is to provide statistics like the number of tuples, dead
tuples, and free space in a relation. Sequences, on the other hand, return
only one value at a time and don�t have attributes like dead tuples.

[...]

*2/ Explicitly Block Sequence Support in pgstattuple*: We could align
sequences with other unsupported objects, such as foreign tables, by
providing a more explicit error message.

While it is apparently pretty uncommon to use pgstattuple on sequences,
this is arguably a bug that should be fixed and back-patched. I've CC'd
Michael Paquier, who is working on sequence AMs and may have thoughts. I
haven't looked at his patch set for that, but I'm assuming that it might
fill in pg_class.relam for sequences, which would have the same effect as
option 1.

I see a couple of other places we might want to look into as part of this
thread. Besides pgstattuple, I see that pageinspect and pg_surgery follow
a similar pattern. pgrowlocks does, too, but that one seems intentionally
limited to RELKIND_RELATION. I also see that amcheck explicitly allows
sequences:

/*
* Sequences always use heap AM, but they don't show that in the catalogs.
* Other relkinds might be using a different AM, so check.
*/
if (ctx.rel->rd_rel->relkind != RELKIND_SEQUENCE &&
ctx.rel->rd_rel->relam != HEAP_TABLE_AM_OID)
ereport(ERROR,
(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
errmsg("only heap AM is supported")));

IMHO it would be good to establish some level of consistency here.

--
nathan

#4Robert Haas
robertmhaas@gmail.com
In reply to: Nathan Bossart (#3)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Mon, Aug 26, 2024 at 1:26 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

While it is apparently pretty uncommon to use pgstattuple on sequences,
this is arguably a bug that should be fixed and back-patched.

I don't understand what would make it a bug.

IMHO it would be good to establish some level of consistency here.

Sure, consistency is good, all other things being equal, but just
saying "well this used to work one way and now it works another way"
isn't enough to say that there is a bug, or that something should be
changed.

--
Robert Haas
EDB: http://www.enterprisedb.com

#5Ayush Vatsa
ayushvatsa1810@gmail.com
In reply to: Robert Haas (#4)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

You state that the
information pgstattuple provides isn't really useful for sequences, so
that means there's no real reason to do (1)

That's correct, but we should consider that up until v11,
sequences were supported in pgstattuple. Their support
was removed unintentionally (I believe so). Therefore, it might be worth
discussing whether it makes sense to reinstate support for sequences.

why you think that the current one is bad

The current implementation has some drawbacks.
For instance, when encountering other unsupported objects, the error looks
like this:
ERROR: cannot get tuple-level statistics for relation "x"
DETAIL: This operation is not supported for foreign tables.

However, for sequences, the message should explicitly
state that "This operation is not supported for sequences."

Currently, we're deducing that the heap access method (AM) is
for relkind='r', so the message "only heap AM is supported" implies
that only relkind='r' are supported.
This prompted my thoughts on the matter.

Moreover, if you refer to the code in pgstattuple.c
<https://github.com/postgres/postgres/blob/master/contrib/pgstattuple/pgstattuple.c#L255-L256&gt;
,
you'll notice that sequences appear to be explicitly allowed in
pgstattuple,
but it results in an error encountered here -
https://github.com/postgres/postgres/blob/master/contrib/pgstattuple/pgstattuple.c#L326-L329
Therefore, I believe a small refactoring is needed to make the code cleaner
and more consistent.

IMHO it would be good to establish some level of consistency here.

Agree.

Let me know your thoughts.

Regards
Ayush Vatsa
AWS

#6Nathan Bossart
nathandbossart@gmail.com
In reply to: Robert Haas (#4)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Mon, Aug 26, 2024 at 01:35:52PM -0400, Robert Haas wrote:

On Mon, Aug 26, 2024 at 1:26 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

While it is apparently pretty uncommon to use pgstattuple on sequences,
this is arguably a bug that should be fixed and back-patched.

I don't understand what would make it a bug.

IMHO it would be good to establish some level of consistency here.

Sure, consistency is good, all other things being equal, but just
saying "well this used to work one way and now it works another way"
isn't enough to say that there is a bug, or that something should be
changed.

The reason I think it's arguably a bug is because it used to work fine and
then started ERROR-ing after commit 4b82664. I'm fine with saying that we
don't think it's useful and intentionally deprecating it, but AFAICT no
such determination has been made. I see no discussion about this on the
thread for commit 4b82664, and the only caller of pgstat_heap()
intentionally calls into the affected function for sequences (and has since
pgstattuple was introduced 18 years ago):

if (RELKIND_HAS_TABLE_AM(rel->rd_rel->relkind) ||
rel->rd_rel->relkind == RELKIND_SEQUENCE)
{
return pgstat_heap(rel, fcinfo);
}

--
nathan

#7Ayush Vatsa
ayushvatsa1810@gmail.com
In reply to: Nathan Bossart (#6)
1 attachment(s)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

Hi all,
Please find attached the patch that re-enables
support for sequences within the pgstattuple extension.
I have also included the necessary test cases for
sequences, implemented in the form of regress tests.

Here is the commitfest link for the same -
https://commitfest.postgresql.org/49/5215/

Regards
Ayush Vatsa
AWS

Attachments:

v1-0001-Introduced-support-for-sequences-back-in-pgstattu.patchapplication/x-patch; name=v1-0001-Introduced-support-for-sequences-back-in-pgstattu.patchDownload
From 95ab0aa019e1cfec73bc94448faeafeac67b434e Mon Sep 17 00:00:00 2001
From: Ayush Vatsa <ayuvatsa@amazon.com>
Date: Thu, 29 Aug 2024 21:40:29 +0530
Subject: [PATCH v1] Introduced support for sequences back in pgstattuple
 extension

---
 contrib/pgstattuple/expected/pgstattuple.out | 24 ++++++++++++++++++++
 contrib/pgstattuple/pgstattuple.c            |  6 ++++-
 contrib/pgstattuple/sql/pgstattuple.sql      | 10 ++++++++
 3 files changed, 39 insertions(+), 1 deletion(-)

diff --git a/contrib/pgstattuple/expected/pgstattuple.out b/contrib/pgstattuple/expected/pgstattuple.out
index 283856e109..1e79fd9036 100644
--- a/contrib/pgstattuple/expected/pgstattuple.out
+++ b/contrib/pgstattuple/expected/pgstattuple.out
@@ -273,6 +273,30 @@ select pgstathashindex('test_partition_hash_idx');
  (4,8,0,1,0,0,0,100)
 (1 row)
 
+-- test on sequence
+-- only pgstattuple and pg_relpages should work
+create sequence serial;
+select * from pgstattuple('serial');
+ table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
+-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
+      8192 |           1 |        41 |           0.5 |                0 |              0 |                  0 |       8104 |        98.93
+(1 row)
+
+select pgstatindex('serial');
+ERROR:  relation "serial" is not a btree index
+select pgstatginindex('serial');
+ERROR:  relation "serial" is not a GIN index
+select pgstathashindex('serial');
+ERROR:  relation "serial" is not a hash index
+select pg_relpages('serial');
+ pg_relpages 
+-------------
+           1
+(1 row)
+
+select * from pgstattuple_approx('serial');
+ERROR:  relation "serial" is of wrong relation kind
+DETAIL:  This operation is not supported for sequences.
 drop table test_partitioned;
 drop view test_view;
 drop foreign table test_foreign_table;
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 3bd8b96197..13c70c4152 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -323,7 +323,11 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 	pgstattuple_type stat = {0};
 	SnapshotData SnapshotDirty;
 
-	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+	/**
+	 * Sequences don't fall under heap AM but are still
+	 * allowed for obtaining tuple-level statistics.
+	 */
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID && rel->rd_rel->relkind != RELKIND_SEQUENCE)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("only heap AM is supported")));
diff --git a/contrib/pgstattuple/sql/pgstattuple.sql b/contrib/pgstattuple/sql/pgstattuple.sql
index b08c31c21b..ea121df0c7 100644
--- a/contrib/pgstattuple/sql/pgstattuple.sql
+++ b/contrib/pgstattuple/sql/pgstattuple.sql
@@ -119,6 +119,16 @@ create index test_partition_hash_idx on test_partition using hash (a);
 select pgstatindex('test_partition_idx');
 select pgstathashindex('test_partition_hash_idx');
 
+-- test on sequence
+-- only pgstattuple and pg_relpages should work
+create sequence serial;
+select * from pgstattuple('serial');
+select pgstatindex('serial');
+select pgstatginindex('serial');
+select pgstathashindex('serial');
+select pg_relpages('serial');
+select * from pgstattuple_approx('serial');
+
 drop table test_partitioned;
 drop view test_view;
 drop foreign table test_foreign_table;
-- 
2.41.0

#8Nathan Bossart
nathandbossart@gmail.com
In reply to: Ayush Vatsa (#7)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Aug 29, 2024 at 10:17:57PM +0530, Ayush Vatsa wrote:

Please find attached the patch that re-enables
support for sequences within the pgstattuple extension.
I have also included the necessary test cases for
sequences, implemented in the form of regress tests.

Thanks. Robert, do you have any concerns with this?

+select * from pgstattuple('serial');
+ table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
+-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
+      8192 |           1 |        41 |           0.5 |                0 |              0 |                  0 |       8104 |        98.93
+(1 row)

I'm concerned that some of this might be platform-dependent and make the
test unstable. Perhaps we should just select count(*) here.

+	/**
+	 * Sequences don't fall under heap AM but are still
+	 * allowed for obtaining tuple-level statistics.
+	 */

I think we should be a bit more descriptive here, like the comment in
verify_heapam.c:

/*
* Sequences always use heap AM, but they don't show that in the catalogs.
* Other relkinds might be using a different AM, so check.
*/

--
nathan

#9Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#8)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Aug 29, 2024 at 12:36:35PM -0500, Nathan Bossart wrote:

+select * from pgstattuple('serial');
+ table_len | tuple_count | tuple_len | tuple_percent | dead_tuple_count | dead_tuple_len | dead_tuple_percent | free_space | free_percent 
+-----------+-------------+-----------+---------------+------------------+----------------+--------------------+------------+--------------
+      8192 |           1 |        41 |           0.5 |                0 |              0 |                  0 |       8104 |        98.93
+(1 row)

I'm concerned that some of this might be platform-dependent and make the
test unstable. Perhaps we should just select count(*) here.

Sure enough, the CI testing for 32-bit is failing here [0]https://api.cirrus-ci.com/v1/artifact/task/4798423386292224/testrun/build-32/testrun/pgstattuple/regress/regression.diffs.

[0]: https://api.cirrus-ci.com/v1/artifact/task/4798423386292224/testrun/build-32/testrun/pgstattuple/regress/regression.diffs

--
nathan

#10Ayush Vatsa
ayushvatsa1810@gmail.com
In reply to: Nathan Bossart (#9)
1 attachment(s)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

I'm concerned that some of this might be platform-dependent and make the
test unstable. Perhaps we should just select count(*) here.
Sure enough, the CI testing for 32-bit is failing here [0].

Thanks for catching that! I wasn't aware of this earlier.

I think we should be a bit more descriptive here

Regarding the comment, I've tried to make it more
descriptive and simpler than the existing one in
verify_heapam.c. Here’s the comment I propose:

/*
* Sequences implicitly use the heap AM, even though it's not explicitly
* recorded in the catalogs. For other relation kinds, verify that the AM
* is heap; otherwise, raise an error.
*/

Please let me know if this still isn’t clear enough,
then I can make further revisions in line with verify_heapam.c.

The patch with all the changes is attached.

Regards
Ayush Vatsa
AWS

Attachments:

v1-0001-Introduced-support-for-sequences-back-in-pgstattu.patchapplication/octet-stream; name=v1-0001-Introduced-support-for-sequences-back-in-pgstattu.patchDownload
From d51682891d1a98c565850c68906d7e2f049e9c65 Mon Sep 17 00:00:00 2001
From: Ayush Vatsa <ayuvatsa@amazon.com>
Date: Thu, 29 Aug 2024 21:40:29 +0530
Subject: [PATCH v1] Introduced support for sequences back in pgstattuple
 extension

---
 contrib/pgstattuple/expected/pgstattuple.out | 24 ++++++++++++++++++++
 contrib/pgstattuple/pgstattuple.c            |  8 ++++++-
 contrib/pgstattuple/sql/pgstattuple.sql      | 10 ++++++++
 3 files changed, 41 insertions(+), 1 deletion(-)

diff --git a/contrib/pgstattuple/expected/pgstattuple.out b/contrib/pgstattuple/expected/pgstattuple.out
index 283856e109..1e32370830 100644
--- a/contrib/pgstattuple/expected/pgstattuple.out
+++ b/contrib/pgstattuple/expected/pgstattuple.out
@@ -273,6 +273,30 @@ select pgstathashindex('test_partition_hash_idx');
  (4,8,0,1,0,0,0,100)
 (1 row)
 
+-- test on sequence
+-- only pgstattuple and pg_relpages should work
+create sequence serial;
+select count(*) from pgstattuple('serial');
+ count 
+-------
+     1
+(1 row)
+
+select pgstatindex('serial');
+ERROR:  relation "serial" is not a btree index
+select pgstatginindex('serial');
+ERROR:  relation "serial" is not a GIN index
+select pgstathashindex('serial');
+ERROR:  relation "serial" is not a hash index
+select pg_relpages('serial');
+ pg_relpages 
+-------------
+           1
+(1 row)
+
+select count(*) from pgstattuple_approx('serial');
+ERROR:  relation "serial" is of wrong relation kind
+DETAIL:  This operation is not supported for sequences.
 drop table test_partitioned;
 drop view test_view;
 drop foreign table test_foreign_table;
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 3bd8b96197..d4fc891877 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -323,7 +323,13 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 	pgstattuple_type stat = {0};
 	SnapshotData SnapshotDirty;
 
-	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+	/*
+	 * Sequences implicitly use the heap AM, even though it's not explicitly
+	 * recorded in the catalogs. For other relation kinds, verify that the AM
+	 * is heap; otherwise, raise an error.
+	 */
+	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID &&
+		rel->rd_rel->relkind != RELKIND_SEQUENCE)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("only heap AM is supported")));
diff --git a/contrib/pgstattuple/sql/pgstattuple.sql b/contrib/pgstattuple/sql/pgstattuple.sql
index b08c31c21b..e4bf77eb32 100644
--- a/contrib/pgstattuple/sql/pgstattuple.sql
+++ b/contrib/pgstattuple/sql/pgstattuple.sql
@@ -119,6 +119,16 @@ create index test_partition_hash_idx on test_partition using hash (a);
 select pgstatindex('test_partition_idx');
 select pgstathashindex('test_partition_hash_idx');
 
+-- test on sequence
+-- only pgstattuple and pg_relpages should work
+create sequence serial;
+select count(*) from pgstattuple('serial');
+select pgstatindex('serial');
+select pgstatginindex('serial');
+select pgstathashindex('serial');
+select pg_relpages('serial');
+select count(*) from pgstattuple_approx('serial');
+
 drop table test_partitioned;
 drop view test_view;
 drop foreign table test_foreign_table;
-- 
2.41.0

#11Nathan Bossart
nathandbossart@gmail.com
In reply to: Ayush Vatsa (#10)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Fri, Aug 30, 2024 at 12:17:47AM +0530, Ayush Vatsa wrote:

The patch with all the changes is attached.

Looks generally reasonable to me.

--
nathan

#12Robert Haas
robertmhaas@gmail.com
In reply to: Nathan Bossart (#8)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Aug 29, 2024 at 1:36 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

Thanks. Robert, do you have any concerns with this?

I don't know if I'm exactly concerned but I don't understand what
problem we're solving, either. I thought Ayush said that the function
wouldn't produce useful results for sequences; so then why do we need
to change the code to enable it?

--
Robert Haas
EDB: http://www.enterprisedb.com

#13Nathan Bossart
nathandbossart@gmail.com
In reply to: Robert Haas (#12)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Fri, Aug 30, 2024 at 04:07:30PM -0400, Robert Haas wrote:

On Thu, Aug 29, 2024 at 1:36 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

Thanks. Robert, do you have any concerns with this?

I don't know if I'm exactly concerned but I don't understand what
problem we're solving, either. I thought Ayush said that the function
wouldn't produce useful results for sequences; so then why do we need
to change the code to enable it?

I suppose it would be difficult to argue that it is actually useful, given
it hasn't worked since v11 and apparently nobody noticed until recently.
If we're content to leave it unsupported, then sure, let's just remove the
"relkind == RELKIND_SEQUENCE" check in pgstat_relation(). But I also don't
have a great reason to _not_ support it. It used to work (which appears to
have been intentional, based on the code), it was unintentionally broken,
and it'd work again with a ~1 line change. "SELECT count(*) FROM
my_sequence" probably doesn't provide a lot of value, but I have no
intention of proposing a patch that removes support for that.

All that being said, I don't have a terribly strong opinion, but I guess I
lean towards re-enabling.

Another related inconsistency I just noticed in pageinspect:

postgres=# select t_data from heap_page_items(get_raw_page('s', 0));
t_data
--------------------------------------
\x0100000000000000000000000000000000
(1 row)

postgres=# select tuple_data_split('s'::regclass, t_data, t_infomask, t_infomask2, t_bits) from heap_page_items(get_raw_page('s', 0));
ERROR: only heap AM is supported

--
nathan

#14Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#13)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Fri, Aug 30, 2024 at 04:06:03PM -0500, Nathan Bossart wrote:

I suppose it would be difficult to argue that it is actually useful, given
it hasn't worked since v11 and apparently nobody noticed until recently.
If we're content to leave it unsupported, then sure, let's just remove the
"relkind == RELKIND_SEQUENCE" check in pgstat_relation(). But I also don't
have a great reason to _not_ support it. It used to work (which appears to
have been intentional, based on the code), it was unintentionally broken,
and it'd work again with a ~1 line change. "SELECT count(*) FROM
my_sequence" probably doesn't provide a lot of value, but I have no
intention of proposing a patch that removes support for that.

IMO, it can be useful to check the state of the page used by a
sequence. We have a few tweaks in sequence.c like manipulations of
t_infomask, and I can be good to check for its status on corrupted
systems.
--
Michael

#15Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#14)
1 attachment(s)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

Here is roughly what I had in mind to commit, but I'm not sure there's a
consensus on doing this.

--
nathan

Attachments:

v3-0001-Reintroduce-support-for-sequences-in-pgstattuple-.patchtext/plain; charset=us-asciiDownload
From 97e670e2e0b53a9dc91c5932ccb34bb1fd7eae6b Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathan@postgresql.org>
Date: Tue, 3 Sep 2024 14:49:24 -0500
Subject: [PATCH v3 1/1] Reintroduce support for sequences in pgstattuple and
 pageinspect.

Commit 4b82664156 restricted a number of functions provided by
contrib modules to only relations that use the "heap" table access
method.  Sequences always use this table access method, but they do
not advertise as such in the pg_class system catalog, so the
aforementioned commit also (presumably unintentionally) removed
support for sequences from these functions.  This commit
reintroduces said support for sequences and adds a couple of
relevant tests.

Co-authored-by: Ayush Vatsa
Reviewed-by: FIXME
Discussion: https://postgr.es/m/CACX%2BKaP3i%2Bi9tdPLjF5JCHVv93xobEdcd_eB%2B638VDvZ3i%3DcQA%40mail.gmail.com
Backpatch-through: 12
---
 contrib/pageinspect/expected/page.out        |  9 +++++++
 contrib/pageinspect/heapfuncs.c              |  6 ++++-
 contrib/pageinspect/sql/page.sql             |  5 ++++
 contrib/pgstattuple/expected/pgstattuple.out | 25 ++++++++++++++++++++
 contrib/pgstattuple/pgstattuple.c            |  6 ++++-
 contrib/pgstattuple/sql/pgstattuple.sql      | 12 ++++++++++
 6 files changed, 61 insertions(+), 2 deletions(-)

diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out
index 80ddb45a60..04fd9dee4b 100644
--- a/contrib/pageinspect/expected/page.out
+++ b/contrib/pageinspect/expected/page.out
@@ -239,3 +239,12 @@ SELECT page_checksum(decode(repeat('00', :block_size), 'hex'), 1);
               
 (1 row)
 
+-- tests for sequences
+create temporary sequence test_sequence;
+select tuple_data_split('test_sequence'::regclass, t_data, t_infomask, t_infomask2, t_bits)
+  from heap_page_items(get_raw_page('test_sequence', 0));
+                   tuple_data_split                    
+-------------------------------------------------------
+ {"\\x0100000000000000","\\x0000000000000000","\\x00"}
+(1 row)
+
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 3faeabc711..38a539dad1 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -320,7 +320,11 @@ tuple_data_split_internal(Oid relid, char *tupdata,
 	raw_attrs = initArrayResult(BYTEAOID, CurrentMemoryContext, false);
 	nattrs = tupdesc->natts;
 
-	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+	/*
+	 * Sequences always use heap AM, but they don't show that in the catalogs.
+	 */
+	if (rel->rd_rel->relkind != RELKIND_SEQUENCE &&
+		rel->rd_rel->relam != HEAP_TABLE_AM_OID)
 		ereport(ERROR, (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 						errmsg("only heap AM is supported")));
 
diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql
index 5bff568d3b..59784fc7cc 100644
--- a/contrib/pageinspect/sql/page.sql
+++ b/contrib/pageinspect/sql/page.sql
@@ -98,3 +98,8 @@ SHOW block_size \gset
 SELECT fsm_page_contents(decode(repeat('00', :block_size), 'hex'));
 SELECT page_header(decode(repeat('00', :block_size), 'hex'));
 SELECT page_checksum(decode(repeat('00', :block_size), 'hex'), 1);
+
+-- tests for sequences
+create temporary sequence test_sequence;
+select tuple_data_split('test_sequence'::regclass, t_data, t_infomask, t_infomask2, t_bits)
+  from heap_page_items(get_raw_page('test_sequence', 0));
diff --git a/contrib/pgstattuple/expected/pgstattuple.out b/contrib/pgstattuple/expected/pgstattuple.out
index 283856e109..9176dc98b6 100644
--- a/contrib/pgstattuple/expected/pgstattuple.out
+++ b/contrib/pgstattuple/expected/pgstattuple.out
@@ -273,6 +273,31 @@ select pgstathashindex('test_partition_hash_idx');
  (4,8,0,1,0,0,0,100)
 (1 row)
 
+-- these should work for sequences
+create sequence test_sequence;
+select count(*) from pgstattuple('test_sequence');
+ count 
+-------
+     1
+(1 row)
+
+select pg_relpages('test_sequence');
+ pg_relpages 
+-------------
+           1
+(1 row)
+
+-- these should fail for sequences
+select pgstatindex('test_sequence');
+ERROR:  relation "test_sequence" is not a btree index
+select pgstatginindex('test_sequence');
+ERROR:  relation "test_sequence" is not a GIN index
+select pgstathashindex('test_sequence');
+ERROR:  relation "test_sequence" is not a hash index
+select pgstattuple_approx('test_sequence');
+ERROR:  relation "test_sequence" is of wrong relation kind
+DETAIL:  This operation is not supported for sequences.
+drop sequence test_sequence;
 drop table test_partitioned;
 drop view test_view;
 drop foreign table test_foreign_table;
diff --git a/contrib/pgstattuple/pgstattuple.c b/contrib/pgstattuple/pgstattuple.c
index 3bd8b96197..1d012233cb 100644
--- a/contrib/pgstattuple/pgstattuple.c
+++ b/contrib/pgstattuple/pgstattuple.c
@@ -323,7 +323,11 @@ pgstat_heap(Relation rel, FunctionCallInfo fcinfo)
 	pgstattuple_type stat = {0};
 	SnapshotData SnapshotDirty;
 
-	if (rel->rd_rel->relam != HEAP_TABLE_AM_OID)
+	/*
+	 * Sequences always use heap AM, but they don't show that in the catalogs.
+	 */
+	if (rel->rd_rel->relkind != RELKIND_SEQUENCE &&
+		rel->rd_rel->relam != HEAP_TABLE_AM_OID)
 		ereport(ERROR,
 				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
 				 errmsg("only heap AM is supported")));
diff --git a/contrib/pgstattuple/sql/pgstattuple.sql b/contrib/pgstattuple/sql/pgstattuple.sql
index b08c31c21b..7e72c567a0 100644
--- a/contrib/pgstattuple/sql/pgstattuple.sql
+++ b/contrib/pgstattuple/sql/pgstattuple.sql
@@ -119,6 +119,18 @@ create index test_partition_hash_idx on test_partition using hash (a);
 select pgstatindex('test_partition_idx');
 select pgstathashindex('test_partition_hash_idx');
 
+-- these should work for sequences
+create sequence test_sequence;
+select count(*) from pgstattuple('test_sequence');
+select pg_relpages('test_sequence');
+
+-- these should fail for sequences
+select pgstatindex('test_sequence');
+select pgstatginindex('test_sequence');
+select pgstathashindex('test_sequence');
+select pgstattuple_approx('test_sequence');
+
+drop sequence test_sequence;
 drop table test_partitioned;
 drop view test_view;
 drop foreign table test_foreign_table;
-- 
2.39.3 (Apple Git-146)

#16Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Nathan Bossart (#13)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Fri, 30 Aug 2024, 23:06 Nathan Bossart, <nathandbossart@gmail.com> wrote:

On Fri, Aug 30, 2024 at 04:07:30PM -0400, Robert Haas wrote:

On Thu, Aug 29, 2024 at 1:36 PM Nathan Bossart <nathandbossart@gmail.com> wrote:

Thanks. Robert, do you have any concerns with this?

I don't know if I'm exactly concerned but I don't understand what
problem we're solving, either. I thought Ayush said that the function
wouldn't produce useful results for sequences; so then why do we need
to change the code to enable it?

I suppose it would be difficult to argue that it is actually useful, given
it hasn't worked since v11 and apparently nobody noticed until recently.
If we're content to leave it unsupported, then sure, let's just remove the
"relkind == RELKIND_SEQUENCE" check in pgstat_relation(). But I also don't
have a great reason to _not_ support it. It used to work (which appears to
have been intentional, based on the code), it was unintentionally broken,
and it'd work again with a ~1 line change. "SELECT count(*) FROM
my_sequence" probably doesn't provide a lot of value, but I have no
intention of proposing a patch that removes support for that.

All that being said, I don't have a terribly strong opinion, but I guess I
lean towards re-enabling.

Another related inconsistency I just noticed in pageinspect:

postgres=# select t_data from heap_page_items(get_raw_page('s', 0));
t_data
--------------------------------------
\x0100000000000000000000000000000000
(1 row)

postgres=# select tuple_data_split('s'::regclass, t_data, t_infomask, t_infomask2, t_bits) from heap_page_items(get_raw_page('s', 0));
ERROR: only heap AM is supported

I don't think this is an inconsistency:

heap_page_items works on a raw page-as-bytea (produced by
get_raw_page) without knowing about or accessing the actual relation
type of that page, so it doesn't have the context why it should error
out if the page looks similar enough to a heap page. I could feed it
an arbitrary bytea, and it should still work as long as that bytea
looks similar enough to a heap page.
tuple_data_split, however, uses the regclass to decode the contents of
the tuple, and can thus determine with certainty based on that
regclass that it was supplied incorrect (non-heapAM table's regclass)
arguments. It therefore has enough context to bail out and stop trying
to decode the page's tuple data.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#17Nathan Bossart
nathandbossart@gmail.com
In reply to: Matthias van de Meent (#16)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Tue, Sep 03, 2024 at 10:19:33PM +0200, Matthias van de Meent wrote:

On Fri, 30 Aug 2024, 23:06 Nathan Bossart, <nathandbossart@gmail.com> wrote:

Another related inconsistency I just noticed in pageinspect:

postgres=# select t_data from heap_page_items(get_raw_page('s', 0));
t_data
--------------------------------------
\x0100000000000000000000000000000000
(1 row)

postgres=# select tuple_data_split('s'::regclass, t_data, t_infomask, t_infomask2, t_bits) from heap_page_items(get_raw_page('s', 0));
ERROR: only heap AM is supported

I don't think this is an inconsistency:

heap_page_items works on a raw page-as-bytea (produced by
get_raw_page) without knowing about or accessing the actual relation
type of that page, so it doesn't have the context why it should error
out if the page looks similar enough to a heap page. I could feed it
an arbitrary bytea, and it should still work as long as that bytea
looks similar enough to a heap page.
tuple_data_split, however, uses the regclass to decode the contents of
the tuple, and can thus determine with certainty based on that
regclass that it was supplied incorrect (non-heapAM table's regclass)
arguments. It therefore has enough context to bail out and stop trying
to decode the page's tuple data.

My point is really that tuple_data_split() needlessly ERRORs for sequences.
Other heap functions work fine for sequences, and we know it uses the heap
table AM, so why should tuple_data_split() fail? I agree that the others
needn't enforce relkind checks and that they might succeed in some cases
where tuple_data_split() might not be appropriate.

--
nathan

#18Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#17)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

Barring objections, I'm planning to commit v3 soon. Robert/Matthias, I'm
not sure you are convinced this is the right thing to do (or worth doing,
rather), but I don't sense that you are actually opposed to it, either.
Please correct me if I am wrong.

--
nathan

#19Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Nathan Bossart (#18)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Wed, 11 Sept 2024 at 21:36, Nathan Bossart <nathandbossart@gmail.com> wrote:

Barring objections, I'm planning to commit v3 soon. Robert/Matthias, I'm
not sure you are convinced this is the right thing to do (or worth doing,
rather), but I don't sense that you are actually opposed to it, either.
Please correct me if I am wrong.

Correct: I do think making heapam-related inspection functions have a
consistent behaviour when applied to sequences is beneficial, with no
preference towards specifically supporting or not supporting sequences
in these functions. If people that work on sequences think it's better
to also support inspection of sequences, then I think that's a good
reason to add that support where it doesn't already exist.

As for patch v3, that seems fine with me.

Matthias van de Meent
Neon (https://neon.tech)

#20Michael Paquier
michael@paquier.xyz
In reply to: Matthias van de Meent (#19)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Wed, Sep 11, 2024 at 11:02:43PM +0100, Matthias van de Meent wrote:

On Wed, 11 Sept 2024 at 21:36, Nathan Bossart <nathandbossart@gmail.com> wrote:

Barring objections, I'm planning to commit v3 soon. Robert/Matthias, I'm
not sure you are convinced this is the right thing to do (or worth doing,
rather), but I don't sense that you are actually opposed to it, either.
Please correct me if I am wrong.

Correct: I do think making heapam-related inspection functions have a
consistent behaviour when applied to sequences is beneficial, with no
preference towards specifically supporting or not supporting sequences
in these functions. If people that work on sequences think it's better
to also support inspection of sequences, then I think that's a good
reason to add that support where it doesn't already exist.

As for patch v3, that seems fine with me.

+1 from here as well, after looking at what v3 is doing for these two
modules.
--
Michael
#21Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#20)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Sep 12, 2024 at 11:19:00AM +0900, Michael Paquier wrote:

On Wed, Sep 11, 2024 at 11:02:43PM +0100, Matthias van de Meent wrote:

As for patch v3, that seems fine with me.

+1 from here as well, after looking at what v3 is doing for these two
modules.

Committed.

--
nathan

#22Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#21)
#23Michael Paquier
michael@paquier.xyz
In reply to: Nathan Bossart (#22)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Sep 12, 2024 at 07:41:30PM -0500, Nathan Bossart wrote:

On Thu, Sep 12, 2024 at 04:39:14PM -0500, Nathan Bossart wrote:

Committed.

Ugh, the buildfarm is unhappy with the new tests [0] [1]. Will fix.

I'd suggest to switch the test to return a count() and make sure that
one record exists. The data in the page does not really matter.
--
Michael

#24Nathan Bossart
nathandbossart@gmail.com
In reply to: Michael Paquier (#23)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Fri, Sep 13, 2024 at 09:47:36AM +0900, Michael Paquier wrote:

On Thu, Sep 12, 2024 at 07:41:30PM -0500, Nathan Bossart wrote:

Ugh, the buildfarm is unhappy with the new tests [0] [1]. Will fix.

I'd suggest to switch the test to return a count() and make sure that
one record exists. The data in the page does not really matter.

That's what I had in mind. I see that skimmer is failing with this error:

ERROR: cannot access temporary tables during a parallel operation

This makes sense because that machine has
debug_parallel_query/force_parallel_mode set to "regress", but this test
file has used a temporary table for a couple of years without issue...

--
nathan

#25Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#24)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Sep 12, 2024 at 07:56:56PM -0500, Nathan Bossart wrote:

I see that skimmer is failing with this error:

ERROR: cannot access temporary tables during a parallel operation

This makes sense because that machine has
debug_parallel_query/force_parallel_mode set to "regress", but this test
file has used a temporary table for a couple of years without issue...

Oh, the answer seems to be commits aeaaf52 and 47a22dc. In short, we can't
use a temporary sequence in this test for versions older than v15.

--
nathan

#26Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#25)
1 attachment(s)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Sep 12, 2024 at 08:42:09PM -0500, Nathan Bossart wrote:

Oh, the answer seems to be commits aeaaf52 and 47a22dc. In short, we can't
use a temporary sequence in this test for versions older than v15.

Here's a patch to make the sequence permanent and to make the output of
tuple_data_split() not depend on endianness.

--
nathan

Attachments:

fix_pageinspect_test.patchtext/plain; charset=us-asciiDownload
diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out
index 04fd9dee4b..3fd3869c82 100644
--- a/contrib/pageinspect/expected/page.out
+++ b/contrib/pageinspect/expected/page.out
@@ -240,11 +240,12 @@ SELECT page_checksum(decode(repeat('00', :block_size), 'hex'), 1);
 (1 row)
 
 -- tests for sequences
-create temporary sequence test_sequence;
+create sequence test_sequence start 72057594037927937;
 select tuple_data_split('test_sequence'::regclass, t_data, t_infomask, t_infomask2, t_bits)
   from heap_page_items(get_raw_page('test_sequence', 0));
                    tuple_data_split                    
 -------------------------------------------------------
- {"\\x0100000000000000","\\x0000000000000000","\\x00"}
+ {"\\x0100000000000001","\\x0000000000000000","\\x00"}
 (1 row)
 
+drop sequence test_sequence;
diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql
index 59784fc7cc..346e4ee142 100644
--- a/contrib/pageinspect/sql/page.sql
+++ b/contrib/pageinspect/sql/page.sql
@@ -100,6 +100,7 @@ SELECT page_header(decode(repeat('00', :block_size), 'hex'));
 SELECT page_checksum(decode(repeat('00', :block_size), 'hex'), 1);
 
 -- tests for sequences
-create temporary sequence test_sequence;
+create sequence test_sequence start 72057594037927937;
 select tuple_data_split('test_sequence'::regclass, t_data, t_infomask, t_infomask2, t_bits)
   from heap_page_items(get_raw_page('test_sequence', 0));
+drop sequence test_sequence;
#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Nathan Bossart (#26)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

Nathan Bossart <nathandbossart@gmail.com> writes:

Here's a patch to make the sequence permanent and to make the output of
tuple_data_split() not depend on endianness.

+1 --- I checked this on mamba's host and it does produce
"\\x0100000000000001" regardless of endianness.

regards, tom lane

#28Nathan Bossart
nathandbossart@gmail.com
In reply to: Tom Lane (#27)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Sep 12, 2024 at 10:40:11PM -0400, Tom Lane wrote:

Nathan Bossart <nathandbossart@gmail.com> writes:

Here's a patch to make the sequence permanent and to make the output of
tuple_data_split() not depend on endianness.

+1 --- I checked this on mamba's host and it does produce
"\\x0100000000000001" regardless of endianness.

Thanks for checking. I'll commit this fix in the morning.

--
nathan

#29Nathan Bossart
nathandbossart@gmail.com
In reply to: Nathan Bossart (#28)
Re: Pgstattuple on Sequences: Seeking Community Feedback on Potential Patch

On Thu, Sep 12, 2024 at 10:41:27PM -0500, Nathan Bossart wrote:

Thanks for checking. I'll commit this fix in the morning.

Committed.

--
nathan