Allow pageinspect's bt_page_stats function to return a set of rows instead of a single row
Hello Hackers,
While working on one of my blogs on the B-Tree indexes
<https://www.percona.com/blog/postgresql-14-b-tree-index-reduced-bloat-with-bottom-up-deletion/>,
I needed to look at a range of B-Tree page statistics. So the goto solution
was to use pageinspect. However, reviewing stats for multiple pages meant
issuing multiple queries. I felt that there's an opportunity for
improvement in the extension by extending the API to output the statistics
for multiple pages with a single query.
That attached patch is based on the master branch. It makes the following
changes to the pageinspect contrib module:
- Updates bt_page_stats_internal function to accept 3 arguments instead of
2.
- The function now uses SRF macros to return a set rather than a single
row. The function call now requires specifying column names.
The extension version is bumped to 1.11 (PAGEINSPECT_V1_11).
To maintain backward compatibility, for versions below 1.11, the multi-call
mechanism is ended to keep the old behavior consistent.
Regression test cases for the module are updated as well as part of this
change. Here is a subset of queries that are added to the btree.sql test
case file for pageinspect.
----
CREATE TABLE test2 AS (SELECT generate_series(1, 5000) AS col1);
CREATE INDEX test2_col1_idx ON test2(col1);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 2);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 0);
SELECT * FROM bt_page_stats('test2_col1_idx', 0, 1);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, -1);
DROP TABLE test2;
----
Regards.
--
Hamid Akhtar,
Percona LLC,
URL : www.percona.com
CELL:+923335449950
SKYPE: engineeredvirus
Attachments:
pageinspect_btree_pagestats_01.patchapplication/x-patch; name=pageinspect_btree_pagestats_01.patchDownload+326-103
Hi,
On 6/27/22 9:31 AM, Hamid Akhtar wrote:
Hello Hackers,
While working on one of my blogs on the B-Tree indexes
<https://www.percona.com/blog/postgresql-14-b-tree-index-reduced-bloat-with-bottom-up-deletion/>,
I needed to look at a range of B-Tree page statistics. So the goto
solution was to use pageinspect. However, reviewing stats for multiple
pages meant issuing multiple queries.
FWIW, I think you could also rely on generate_series()
I felt that there's an opportunity for improvement in the extension by
extending the API to output the statistics for multiple pages with a
single query.That attached patch is based on the master branch. It makes the
following changes to the pageinspect contrib module:
- Updates bt_page_stats_internal function to accept 3 arguments
instead of 2.
- The function now uses SRF macros to return a set rather than a
single row. The function call now requires specifying column names.The extension version is bumped to 1.11 (PAGEINSPECT_V1_11).
To maintain backward compatibility, for versions below 1.11, the
multi-call mechanism is ended to keep the old behavior consistent.Regression test cases for the module are updated as well as part of
this change. Here is a subset of queries that are added to the
btree.sql test case file for pageinspect.----
CREATE TABLE test2 AS (SELECT generate_series(1, 5000) AS col1);
CREATE INDEX test2_col1_idx ON test2(col1);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 2);
For example, this could be written as:
select * from
generate_series(1, 2) blkno ,
bt_page_stats('test2_col1_idx',blkno::int);
Or, if one wants to inspect to whole relation, something like:
select * from
generate_series(1, pg_relation_size('test2_col1_idx'::regclass::text) /
8192 - 1) blkno ,
bt_page_stats('test2_col1_idx',blkno::int);
Regards,
Bertrand
On Mon, Jun 27, 2022 at 1:40 PM Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
Hi,
On 6/27/22 9:31 AM, Hamid Akhtar wrote:
Hello Hackers,
While working on one of my blogs on the B-Tree indexes, I needed to look at a range of B-Tree page statistics. So the goto solution was to use pageinspect. However, reviewing stats for multiple pages meant issuing multiple queries.
+1 to improve the API.
I felt that there's an opportunity for improvement in the extension by extending the API to output the statistics for multiple pages with a single query.
That attached patch is based on the master branch. It makes the following changes to the pageinspect contrib module:
- Updates bt_page_stats_internal function to accept 3 arguments instead of 2.
- The function now uses SRF macros to return a set rather than a single row. The function call now requires specifying column names.The extension version is bumped to 1.11 (PAGEINSPECT_V1_11).
To maintain backward compatibility, for versions below 1.11, the multi-call mechanism is ended to keep the old behavior consistent.Regression test cases for the module are updated as well as part of this change. Here is a subset of queries that are added to the btree.sql test case file for pageinspect.
----
CREATE TABLE test2 AS (SELECT generate_series(1, 5000) AS col1);
CREATE INDEX test2_col1_idx ON test2(col1);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 2);For example, this could be written as:
select * from
generate_series(1, 2) blkno ,
bt_page_stats('test2_col1_idx',blkno::int);Or, if one wants to inspect to whole relation, something like:
select * from
generate_series(1, pg_relation_size('test2_col1_idx'::regclass::text) / 8192 - 1) blkno ,
bt_page_stats('test2_col1_idx',blkno::int);
Good one. But not all may know the alternatives. Do we have any
difference in the execution times for the above query vs the new
function introduced in the v1 patch? If there's not much difference, I
would suggest adding an SQL function around the generate_series
approach in the pageinspect extension for better and easier usability.
Regards,
Bharath Rupireddy.
On Mon, 27 Jun 2022 at 15:52, Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, Jun 27, 2022 at 1:40 PM Drouvot, Bertrand <bdrouvot@amazon.com>
wrote:Hi,
On 6/27/22 9:31 AM, Hamid Akhtar wrote:
Hello Hackers,
While working on one of my blogs on the B-Tree indexes, I needed to look
at a range of B-Tree page statistics. So the goto solution was to use
pageinspect. However, reviewing stats for multiple pages meant issuing
multiple queries.+1 to improve the API.
I felt that there's an opportunity for improvement in the extension by
extending the API to output the statistics for multiple pages with a single
query.That attached patch is based on the master branch. It makes the
following changes to the pageinspect contrib module:
- Updates bt_page_stats_internal function to accept 3 arguments instead
of 2.
- The function now uses SRF macros to return a set rather than a single
row. The function call now requires specifying column names.
The extension version is bumped to 1.11 (PAGEINSPECT_V1_11).
To maintain backward compatibility, for versions below 1.11, themulti-call mechanism is ended to keep the old behavior consistent.
Regression test cases for the module are updated as well as part of this
change. Here is a subset of queries that are added to the btree.sql test
case file for pageinspect.----
CREATE TABLE test2 AS (SELECT generate_series(1, 5000) AS col1);
CREATE INDEX test2_col1_idx ON test2(col1);
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 2);For example, this could be written as:
select * from
generate_series(1, 2) blkno ,
bt_page_stats('test2_col1_idx',blkno::int);Or, if one wants to inspect to whole relation, something like:
select * from
generate_series(1, pg_relation_size('test2_col1_idx'::regclass::text) /8192 - 1) blkno ,
bt_page_stats('test2_col1_idx',blkno::int);
Good one. But not all may know the alternatives.
+1
Do we have any
difference in the execution times for the above query vs the new
function introduced in the v1 patch? If there's not much difference, I
would suggest adding an SQL function around the generate_series
approach in the pageinspect extension for better and easier usability.
Based on some basic SQL execution time comparison of the two approaches, I
see that the API change, on average, is around 40% faster than the SQL.
CREATE TABLE test2 AS (SELECT generate_series(1, 5000000) AS col1);
CREATE INDEX test2_col1_idx ON test2(col1);
EXPLAIN ANALYZE
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 5000);
EXPLAIN ANALYZE
SELECT * FROM GENERATE_SERIES(1, 5000) blkno,
bt_page_stats('test2_col1_idx',blkno::int);
For me, the API change returns back the data in around 74ms whereas the SQL
returns it in 102ms. So considering this and as you mentioned, the
alternative may not be that obvious to everyone, it is a fair improvement.
Show quoted text
Regards,
Bharath Rupireddy.
On Thu, Jun 30, 2022 at 1:54 PM Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
Do we have any
difference in the execution times for the above query vs the new
function introduced in the v1 patch? If there's not much difference, I
would suggest adding an SQL function around the generate_series
approach in the pageinspect extension for better and easier usability.Based on some basic SQL execution time comparison of the two approaches, I see that the API change, on average, is around 40% faster than the SQL.
CREATE TABLE test2 AS (SELECT generate_series(1, 5000000) AS col1);
CREATE INDEX test2_col1_idx ON test2(col1);EXPLAIN ANALYZE
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 5000);EXPLAIN ANALYZE
SELECT * FROM GENERATE_SERIES(1, 5000) blkno, bt_page_stats('test2_col1_idx',blkno::int);For me, the API change returns back the data in around 74ms whereas the SQL returns it in 102ms. So considering this and as you mentioned, the alternative may not be that obvious to everyone, it is a fair improvement.
I'm wondering what happens with a bit of huge data and different test
cases each test case executed, say, 2 or 3 times.
If the difference in execution times is always present, then the API
approach or changing the core function would make more sense.
Regards,
Bharath Rupireddy.
On Thu, 30 Jun 2022 at 14:27, Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Jun 30, 2022 at 1:54 PM Hamid Akhtar <hamid.akhtar@gmail.com>
wrote:Do we have any
difference in the execution times for the above query vs the new
function introduced in the v1 patch? If there's not much difference, I
would suggest adding an SQL function around the generate_series
approach in the pageinspect extension for better and easier usability.Based on some basic SQL execution time comparison of the two approaches,
I see that the API change, on average, is around 40% faster than the SQL.
CREATE TABLE test2 AS (SELECT generate_series(1, 5000000) AS col1);
CREATE INDEX test2_col1_idx ON test2(col1);EXPLAIN ANALYZE
SELECT * FROM bt_page_stats('test2_col1_idx', 1, 5000);EXPLAIN ANALYZE
SELECT * FROM GENERATE_SERIES(1, 5000) blkno,bt_page_stats('test2_col1_idx',blkno::int);
For me, the API change returns back the data in around 74ms whereas the
SQL returns it in 102ms. So considering this and as you mentioned, the
alternative may not be that obvious to everyone, it is a fair improvement.I'm wondering what happens with a bit of huge data and different test
cases each test case executed, say, 2 or 3 times.If the difference in execution times is always present, then the API
approach or changing the core function would make more sense.
Technically, AFAIK, the performance difference will always be there.
Firstly, in the API change, there is no additional overhead of the
generate_series function. Additionally, with API change, looping over the
pages has a smaller overhead when compared with the overhead of the SQL
approach.
Show quoted text
Regards,
Bharath Rupireddy.
Hi,
On 6/30/22 10:24 AM, Hamid Akhtar wrote:
On Mon, 27 Jun 2022 at 15:52, Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:On Mon, Jun 27, 2022 at 1:40 PM Drouvot, Bertrand
<bdrouvot@amazon.com> wrote:Hi,
On 6/27/22 9:31 AM, Hamid Akhtar wrote:
Hello Hackers,
While working on one of my blogs on the B-Tree indexes, I needed
to look at a range of B-Tree page statistics. So the goto solution
was to use pageinspect. However, reviewing stats for multiple
pages meant issuing multiple queries.+1 to improve the API.
I think it makes sense too.
But what about the other pageinspect's functions that also use a single
blkno as parameter? Should not the patch also takes care of them?
Regards,
Bertrand
On Fri, 1 Jul 2022 at 13:01, Drouvot, Bertrand <bdrouvot@amazon.com> wrote:
Hi,
On 6/30/22 10:24 AM, Hamid Akhtar wrote:On Mon, 27 Jun 2022 at 15:52, Bharath Rupireddy <
bharath.rupireddyforpostgres@gmail.com> wrote:On Mon, Jun 27, 2022 at 1:40 PM Drouvot, Bertrand <bdrouvot@amazon.com>
wrote:Hi,
On 6/27/22 9:31 AM, Hamid Akhtar wrote:
Hello Hackers,
While working on one of my blogs on the B-Tree indexes, I needed to
look at a range of B-Tree page statistics. So the goto solution was to use
pageinspect. However, reviewing stats for multiple pages meant issuing
multiple queries.+1 to improve the API.
I think it makes sense too.
But what about the other pageinspect's functions that also use a single
blkno as parameter? Should not the patch also takes care of them?I've started working on that. But it's going to be a much bigger change
with a lot of code refactoring.
So, taking this one step at a time, IMHO, this patch is good to be reviewed
now.
Show quoted text
Regards,
Bertrand
Hamid Akhtar <hamid.akhtar@gmail.com> writes:
That attached patch is based on the master branch. It makes the following
changes to the pageinspect contrib module:
- Updates bt_page_stats_internal function to accept 3 arguments instead of
2.
- The function now uses SRF macros to return a set rather than a single
row. The function call now requires specifying column names.
FWIW, I think you'd be way better off changing the function name, say
to bt_multi_page_stats(). Overloading the name this way is going to
lead to great confusion, e.g. somebody who fat-fingers the number of
output arguments in a JDBC call could see confusing results due to
invoking the wrong one of the two functions. Also, I'm not quite sure
what you mean by "The function call now requires specifying column
names", but it doesn't sound like an acceptable restriction from a
compatibility standpoint. If a different name dodges that issue then
it's clearly a better way.
regards, tom lane
On Thu, 28 Jul 2022 at 00:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hamid Akhtar <hamid.akhtar@gmail.com> writes:
That attached patch is based on the master branch. It makes the following
changes to the pageinspect contrib module:
- Updates bt_page_stats_internal function to accept 3 arguments insteadof
2.
- The function now uses SRF macros to return a set rather than a single
row. The function call now requires specifying column names.FWIW, I think you'd be way better off changing the function name, say
to bt_multi_page_stats(). Overloading the name this way is going to
lead to great confusion, e.g. somebody who fat-fingers the number of
output arguments in a JDBC call could see confusing results due to
invoking the wrong one of the two functions. Also, I'm not quite sure
what you mean by "The function call now requires specifying column
names", but it doesn't sound like an acceptable restriction from a
compatibility standpoint. If a different name dodges that issue then
it's clearly a better way.regards, tom lane
Attached please find the latest version of the patch;
pageinspect_btree_multipagestats_02.patch.
It no longer modifies the existing bt_page_stats function. Instead, it
introduces a new function bt_multi_page_stats as you had suggested. The
function expects three arguments where the first argument is the index name
followed by block number and number of blocks to be returned.
Please ignore this statement. It was a typo.
"The function call now requires specifying column names"
Attachments:
pageinspect_btree_multipagestats_02.patchapplication/x-patch; name=pageinspect_btree_multipagestats_02.patchDownload+343-76
On Sun, 31 Jul 2022 at 18:00, Hamid Akhtar <hamid.akhtar@gmail.com> wrote:
On Thu, 28 Jul 2022 at 00:36, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hamid Akhtar <hamid.akhtar@gmail.com> writes:
That attached patch is based on the master branch. It makes the
following
changes to the pageinspect contrib module:
- Updates bt_page_stats_internal function to accept 3 arguments insteadof
2.
- The function now uses SRF macros to return a set rather than a single
row. The function call now requires specifying column names.FWIW, I think you'd be way better off changing the function name, say
to bt_multi_page_stats(). Overloading the name this way is going to
lead to great confusion, e.g. somebody who fat-fingers the number of
output arguments in a JDBC call could see confusing results due to
invoking the wrong one of the two functions. Also, I'm not quite sure
what you mean by "The function call now requires specifying column
names", but it doesn't sound like an acceptable restriction from a
compatibility standpoint. If a different name dodges that issue then
it's clearly a better way.regards, tom lane
Attached please find the latest version of the patch;
pageinspect_btree_multipagestats_02.patch.It no longer modifies the existing bt_page_stats function. Instead, it
introduces a new function bt_multi_page_stats as you had suggested. The
function expects three arguments where the first argument is the index name
followed by block number and number of blocks to be returned.Please ignore this statement. It was a typo.
"The function call now requires specifying column names"
Attached is the rebased version of the patch
(pageinspect_btree_multipagestats_03.patch) for the master branch.
Show quoted text
Attachments:
pageinspect_btree_multipagestats_03.patchapplication/x-patch; name=pageinspect_btree_multipagestats_03.patchDownload+343-76
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested
Looks good to me.
The new status of this patch is: Ready for Committer
On Mon, Aug 1, 2022 at 11:29 PM Naeem Akhter <naeem.akhter@percona.com>
wrote:
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not testedLooks good to me.
The new status of this patch is: Ready for Committer
The patch has a compilation error on the latest code base, please rebase
your patch.
[03:08:46.087] /tmp/cirrus-ci-build/contrib/cube/cubeparse.h:77:27: error:
‘result’ was not declared in this scope
[03:08:46.087] 77 | int cube_yyparse (NDBOX **result, Size scanbuflen);
[03:08:46.087] | ^~~~~~
[03:08:46.087] /tmp/cirrus-ci-build/contrib/cube/cubeparse.h:77:40: error:
expected primary-expression before ‘scanbuflen’
[03:08:46.087] 77 | int cube_yyparse (NDBOX **result, Size scanbuflen);
...
--
Ibrar Ahmed
On Tue, 6 Sept 2022 at 11:25, Ibrar Ahmed <ibrar.ahmad@gmail.com> wrote:
On Mon, Aug 1, 2022 at 11:29 PM Naeem Akhter <naeem.akhter@percona.com>
wrote:The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not testedLooks good to me.
The new status of this patch is: Ready for Committer
The patch has a compilation error on the latest code base, please rebase
your patch.[03:08:46.087] /tmp/cirrus-ci-build/contrib/cube/cubeparse.h:77:27: error:
‘result’ was not declared in this scope
[03:08:46.087] 77 | int cube_yyparse (NDBOX **result, Size scanbuflen);
[03:08:46.087] | ^~~~~~
[03:08:46.087] /tmp/cirrus-ci-build/contrib/cube/cubeparse.h:77:40: error:
expected primary-expression before ‘scanbuflen’
[03:08:46.087] 77 | int cube_yyparse (NDBOX **result, Size scanbuflen);
...--
Ibrar Ahmed
The compilation and regression are working fine. I have verified it against
the tip of the master branch [commit: 57796a0f].
Hamid Akhtar <hamid.akhtar@gmail.com> writes:
Attached is the rebased version of the patch
(pageinspect_btree_multipagestats_03.patch) for the master branch.
I looked through this and cleaned up the code a little (attached).
There are still some issues before it could be considered committable:
1. Where's the documentation update?
2. As this stands, there's no nice way to ask for "all the pages".
If you specify a page count that's even one too large, you get an
error. I think there's room for an easier-to-use way to do that.
We could say that the thing just silently stops at the last page,
so that you just need to write a large page count. Or maybe it'd
be better to define a zero or negative page count as "all the rest",
while still insisting that a positive count refer to real pages.
3. I think it's highly likely that the new test case is not portable.
In particular a machine with MAXALIGN 4 would be likely to put a
different number of tuples per page, or do the page split differently
so that the page with fewer index tuples isn't page 3. Unfortunately
I don't seem to have a working setup like that right at the moment
to verify; but I'd counsel trying this inside a VM or something to
see if it's actually likely to survive on the buildfarm. I'm not
sure, but making the indexed column be int8 instead of int4 might
reduce the risks here.
regards, tom lane
Attachments:
pageinspect_btree_multipagestats-v4.patchtext/x-diff; charset=us-ascii; name=pageinspect_btree_multipagestats-v4.patchDownload+342-67
I wrote:
3. I think it's highly likely that the new test case is not portable.
In particular a machine with MAXALIGN 4 would be likely to put a
different number of tuples per page, or do the page split differently
so that the page with fewer index tuples isn't page 3. Unfortunately
I don't seem to have a working setup like that right at the moment
to verify; but I'd counsel trying this inside a VM or something to
see if it's actually likely to survive on the buildfarm.
I spun up a 32-bit VM, since that had been on my to-do list anyway,
and it looks like I was right:
diff -U3 /usr/home/tgl/pgsql/contrib/pageinspect/expected/btree.out /usr/home/tgl/pgsql/contrib/pageinspect/results/btree.out
--- /usr/home/tgl/pgsql/contrib/pageinspect/expected/btree.out 2022-09-12 15:15:40.432135000 -0400
+++ /usr/home/tgl/pgsql/contrib/pageinspect/results/btree.out 2022-09-12 15:15:54.481549000 -0400
@@ -49,11 +49,11 @@
-[ RECORD 1 ]-+-----
blkno | 1
type | l
-live_items | 367
+live_items | 458
dead_items | 0
-avg_item_size | 16
+avg_item_size | 12
page_size | 8192
-free_size | 808
+free_size | 820
btpo_prev | 0
btpo_next | 2
btpo_level | 0
@@ -61,11 +61,11 @@
... etc etc ...
regards, tom lane
On Mon, Sep 12, 2022 at 03:18:59PM -0400, Tom Lane wrote:
I spun up a 32-bit VM, since that had been on my to-do list anyway,
and it looks like I was right:
This feedback has not been addressed and the thread is idle four
weeks, so I have marked this CF entry as RwF. Please feel free to
resubmit once a new version of the patch is available.
--
Michael
On Wed, 12 Oct 2022 at 10:51, Michael Paquier <michael@paquier.xyz> wrote:
On Mon, Sep 12, 2022 at 03:18:59PM -0400, Tom Lane wrote:
I spun up a 32-bit VM, since that had been on my to-do list anyway,
and it looks like I was right:This feedback has not been addressed and the thread is idle four
weeks, so I have marked this CF entry as RwF. Please feel free to
resubmit once a new version of the patch is available.
Attaching the version 5 of the patch that addresses all 3 points raised by
Tom Lane earlier in the thread.
(1) Documentation is added.
(2) Passing "-1" for the number of blocks required now returns all the
remaining index pages after the starting block.
(3) The newly added test cases work for both 32-bit and 64-bit systems.
Show quoted text
--
Michael
Attachments:
pageinspect_btree_multipagestats-v5.patchapplication/octet-stream; name=pageinspect_btree_multipagestats-v5.patchDownload+406-70
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested
i've tested and verified the documentation.
On Mon, 21 Nov 2022 at 17:34, Naeem Akhter <naeem.akhter@percona.com> wrote:
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not testedi've tested and verified the documentation.
Rebasing the patch to the tip of the master branch.