PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

Started by Abdoulaye Baover 1 year ago7 messages
#1Abdoulaye Ba
abdoulayeba29@gmail.com
1 attachment(s)

Hello PostgreSQL Hackers,

I am submitting a patch to add hooks for the functions
pg_total_relation_size and pg_indexes_size. These hooks allow for custom
behaviour to be injected into these functions, which can be useful for
extensions and other custom PostgreSQL modifications.

Patch details:

- Adds pg_total_relation_size_hook and pg_indexes_size_hook
- Modifies pg_total_relation_size and pg_indexes_size to call these
hooks if they are set
- Adds necessary type definitions and extern declarations

This feature is useful because it allows for more flexible and customizable
behaviour in relation size calculations, which can be particularly valuable
for extensions that need to account for additional storage outside of the
standard PostgreSQL mechanisms.

The patch is attached.

Thank you for considering this patch. I look forward to your feedback.

Kind regards,
Abdoulaye Ba

Attachments:

combined.patchtext/x-patch; charset=US-ASCII; name=combined.patchDownload
From 542dc052700097472180b114409603cc071c2e0d Mon Sep 17 00:00:00 2001
From: abi <abdoulayeba29@gmail.com>
Date: Thu, 8 Aug 2024 14:00:44 +0200
Subject: [PATCH 1/2] added hooks for total relation size and indexes size

---
 src/backend/utils/adt/dbsize.c | 11 ++++++++++-
 src/include/utils/rel.h        |  6 ++++++
 2 files changed, 16 insertions(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index b2d9cc2792..c820e06ca7 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -31,6 +31,9 @@
 #include "utils/relmapper.h"
 #include "utils/syscache.h"
 
+Pg_total_relation_size_hook_type pg_total_relation_size_hook = NULL;
+Pg_indexes_size_hook_type pg_indexes_size_hook = NULL;
+
 /* Divide by two and round away from zero */
 #define half_rounded(x)   (((x) + ((x) < 0 ? -1 : 1)) / 2)
 
@@ -513,6 +516,9 @@ pg_indexes_size(PG_FUNCTION_ARGS)
 	if (rel == NULL)
 		PG_RETURN_NULL();
 
+	if (pg_indexes_size_hook)
+        return (*pg_indexes_size_hook)(fcinfo);
+
 	size = calculate_indexes_size(rel);
 
 	relation_close(rel, AccessShareLock);
@@ -555,11 +561,14 @@ pg_total_relation_size(PG_FUNCTION_ARGS)
 	if (rel == NULL)
 		PG_RETURN_NULL();
 
+	if (pg_total_relation_size_hook)
+        return (*pg_total_relation_size_hook)(fcinfo);
+
 	size = calculate_total_relation_size(rel);
 
 	relation_close(rel, AccessShareLock);
 
-	PG_RETURN_INT64(size);
+	PG_RETURN_INT64(size + 1);
 }
 
 /*
diff --git a/src/include/utils/rel.h b/src/include/utils/rel.h
index 8700204953..7191dfd5a6 100644
--- a/src/include/utils/rel.h
+++ b/src/include/utils/rel.h
@@ -708,4 +708,10 @@ RelationCloseSmgr(Relation relation)
 extern void RelationIncrementReferenceCount(Relation rel);
 extern void RelationDecrementReferenceCount(Relation rel);
 
+typedef Datum (*Pg_total_relation_size_hook_type)(FunctionCallInfo fcinfo);
+typedef Datum (*Pg_indexes_size_hook_type)(FunctionCallInfo fcinfo);
+
+extern PGDLLIMPORT Pg_total_relation_size_hook_type pg_total_relation_size_hook;
+extern PGDLLIMPORT Pg_indexes_size_hook_type pg_indexes_size_hook;
+
 #endif							/* REL_H */
-- 
2.34.1

From 16050065e74722421b775afdb0342955e598fedd Mon Sep 17 00:00:00 2001
From: abi <abdoulayeba29@gmail.com>
Date: Thu, 8 Aug 2024 14:06:30 +0200
Subject: [PATCH 2/2] removed the -1

---
 src/backend/utils/adt/dbsize.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/utils/adt/dbsize.c b/src/backend/utils/adt/dbsize.c
index c820e06ca7..2e8835e302 100644
--- a/src/backend/utils/adt/dbsize.c
+++ b/src/backend/utils/adt/dbsize.c
@@ -568,7 +568,7 @@ pg_total_relation_size(PG_FUNCTION_ARGS)
 
 	relation_close(rel, AccessShareLock);
 
-	PG_RETURN_INT64(size + 1);
+	PG_RETURN_INT64(size);
 }
 
 /*
-- 
2.34.1

#2Tomas Vondra
tomas@vondra.me
In reply to: Abdoulaye Ba (#1)
Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

On 8/8/24 14:18, Abdoulaye Ba wrote:

Hello PostgreSQL Hackers,

I am submitting a patch to add hooks for the functions
pg_total_relation_size and pg_indexes_size. These hooks allow for custom
behaviour to be injected into these functions, which can be useful for
extensions and other custom PostgreSQL modifications.

Patch details: 

* Adds pg_total_relation_size_hook and pg_indexes_size_hook 
* Modifies pg_total_relation_size and pg_indexes_size to call these
hooks if they are set 
* Adds necessary type definitions and extern declarations

This feature is useful because it allows for more flexible and
customizable behaviour in relation size calculations, which can be
particularly valuable for extensions that need to account for additional
storage outside of the standard PostgreSQL mechanisms.

The patch is attached. 

Thank you for considering this patch. I look forward to your feedback.

Hi,

Thanks for the patch. A couple comments:

1) Unfortunately, it doesn't compile - it fails with errors like this:

In file included from ../../../../src/include/access/tableam.h:25,
from detoast.c:18:
../../../../src/include/utils/rel.h:711:51: error: unknown type name
‘FunctionCallInfo’
711 | typedef Datum
(*Pg_total_relation_size_hook_type)(FunctionCallInfo fcinfo);

which happens because rel.h references FunctionCallInfo without
including fmgr.h. I wonder how you tested the patch ...

2) Also, I'm not sure why you have the "Pg_" at the beginning.

3) I'm not sure why the patch first changes the return to add +1 and
then undoes that:

PG_RETURN_INT64(size + 1);

That seems quite unnecessary. I wonder how you created the patch, seems
you just concatenated multiple patches.

4) The patch has a mix of tabs and spaces. We don't do that.

5) It would be really helpful if you could explain the motivation for
this. Not just "make this customizable" but what exactly you want to do
in the hooks and why (presumably you have an extension).

6) Isn't it a bit strange the patch modifies pg_total_relation_size()
but not pg_relation_size() or pg_table_size()? Am I missing something?

7) You should add the patch to the next commitfest (2024-09) at

https://commitfest.postgresql.org

regards

--
Tomas Vondra

#3Abdoulaye Ba
abdoulayeba29@gmail.com
In reply to: Tomas Vondra (#2)
Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

On Thu, 8 Aug 2024 at 23:29, Tomas Vondra <tomas@vondra.me> wrote:

On 8/8/24 14:18, Abdoulaye Ba wrote:

Hello PostgreSQL Hackers,

I am submitting a patch to add hooks for the functions
pg_total_relation_size and pg_indexes_size. These hooks allow for custom
behaviour to be injected into these functions, which can be useful for
extensions and other custom PostgreSQL modifications.

Patch details:

* Adds pg_total_relation_size_hook and pg_indexes_size_hook
* Modifies pg_total_relation_size and pg_indexes_size to call these
hooks if they are set
* Adds necessary type definitions and extern declarations

This feature is useful because it allows for more flexible and
customizable behaviour in relation size calculations, which can be
particularly valuable for extensions that need to account for additional
storage outside of the standard PostgreSQL mechanisms.

The patch is attached.

Thank you for considering this patch. I look forward to your feedback.

Hi,

Thanks for the patch. A couple comments:

1) Unfortunately, it doesn't compile - it fails with errors like this:

In file included from ../../../../src/include/access/tableam.h:25,
from detoast.c:18:
../../../../src/include/utils/rel.h:711:51: error: unknown type name
‘FunctionCallInfo’
711 | typedef Datum
(*Pg_total_relation_size_hook_type)(FunctionCallInfo fcinfo);

which happens because rel.h references FunctionCallInfo without
including fmgr.h. I wonder how you tested the patch ...

2) Also, I'm not sure why you have the "Pg_" at the beginning.

3) I'm not sure why the patch first changes the return to add +1 and
then undoes that:

PG_RETURN_INT64(size + 1);

That seems quite unnecessary. I wonder how you created the patch, seems
you just concatenated multiple patches.

4) The patch has a mix of tabs and spaces. We don't do that.

5) It would be really helpful if you could explain the motivation for
this. Not just "make this customizable" but what exactly you want to do
in the hooks and why (presumably you have an extension).

6) Isn't it a bit strange the patch modifies pg_total_relation_size()
but not pg_relation_size() or pg_table_size()? Am I missing something?

7) You should add the patch to the next commitfest (2024-09) at

https://commitfest.postgresql.org

regards

Hi all,

Here's my follow-up based on the feedback received:

Show quoted text

1.

*Compilation Issue:*
I didn’t encounter any errors when compiling on my machine, but I’ll
review the environment and ensure fmgr.h is included where necessary
to avoid the issue.
2.

*Prefix "Pg_":*
I’ll remove the "Pg_" prefix as I see now that it’s unnecessary.
3.

*Return Value Change:*
The +1 in the return value was part of a test that I forgot to remove.
I’ll clean that up in the next version of the patch.
4.

*Tabs and Spaces:*
I’ll correct the indentation to use tabs consistently, as per the
project’s coding standards.
5.

*Motivation:*
The hooks are intended to add support for calculating the Tantivy
index size, in line with the need outlined in this issue
<https://github.com/paradedb/paradedb/issues/1061&gt;. This will allow us
to integrate additional index sizes into PostgreSQL’s built-in size
functions.
6.

*Additional Hooks:*
I’ll add hooks for pg_relation_size() and pg_table_size() for
consistency.

I’ll make these changes and resubmit the patch soon. Thanks again for your
guidance!

Best regards,

#4Andreas Karlsson
andreas@proxel.se
In reply to: Abdoulaye Ba (#1)
Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

On 8/8/24 2:18 PM, Abdoulaye Ba wrote:

I am submitting a patch to add hooks for the functions
pg_total_relation_size and pg_indexes_size. These hooks allow for custom
behaviour to be injected into these functions, which can be useful for
extensions and other custom PostgreSQL modifications.

What kind of extensions do you imagine would use this hook? If it is for
custom index AMs I think adding this to the index AM interface would
make more sense than just adding a generic callback hook.

Andreas

#5Abdoulaye Ba
abdoulayeba29@gmail.com
In reply to: Andreas Karlsson (#4)
Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

On Fri, 9 Aug 2024 at 18:10, Andreas Karlsson <andreas@proxel.se> wrote:

Show quoted text

On 8/8/24 2:18 PM, Abdoulaye Ba wrote:

I am submitting a patch to add hooks for the functions
pg_total_relation_size and pg_indexes_size. These hooks allow for custom
behaviour to be injected into these functions, which can be useful for
extensions and other custom PostgreSQL modifications.

What kind of extensions do you imagine would use this hook? If it is for
custom index AMs I think adding this to the index AM interface would
make more sense than just adding a generic callback hook.

Andreas

The primary use case for this hook is to allow extensions to account for
additional storage mechanisms that are not covered by the default
PostgreSQL relation size calculations. For instance, in our project, we are
working with an external indexing system (Tantivy) that maintains
additional data structures outside the standard PostgreSQL storage. This
hook allows us to include the size of these additional structures in the
total relation size calculations.

While I understand your suggestion about custom index AMs, the intent
behind this hook is broader. It is not limited to custom index types but
can also be used for other forms of external storage that need to be
accounted for in relation size calculations. This is why a generic callback
hook was chosen over extending the index AM interface.

However, if there is a consensus that such a hook would be better suited
within the index AM interface for cases involving custom index storage, I'm
open to discussing this further and exploring how it could be integrated
more tightly with the existing PostgreSQL AM framework.

#6Andreas Karlsson
andreas@proxel.se
In reply to: Abdoulaye Ba (#5)
Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

On 8/9/24 6:59 PM, Abdoulaye Ba wrote:> The primary use case for
this hook is to allow extensions to account

for additional storage mechanisms that are not covered by the
default PostgreSQL relation size calculations. For instance, in our
project, we are working with an external indexing system (Tantivy)
that maintains additional data structures outside the standard
PostgreSQL storage. This hook allows us to include the size of these
additional structures in the total relation size calculations.

While I understand your suggestion about custom index AMs, the
intent behind this hook is broader. It is not limited to custom
index types but can also be used for other forms of external storage
that need to be accounted for in relation size calculations. This is
why a generic callback hook was chosen over extending the index AM
interface.

However, if there is a consensus that such a hook would be better
suited within the index AM interface for cases involving custom
index storage, I'm open to discussing this further and exploring how
it could be integrated more tightly with the existing PostgreSQL AM
framework.

Yeah, I strongly suspected it was ParadeDB. :)

I am only one developer but I really do not like solving this with a
hook, instead I think the proper solution is to integrate this properly
with custom AMs and storage managers. I think we should do it properly
or not at all.

Andreas

#7Tomas Vondra
tomas@vondra.me
In reply to: Andreas Karlsson (#6)
Re: PATCH: Add hooks for pg_total_relation_size and pg_indexes_size

On 8/28/24 17:53, Andreas Karlsson wrote:

On 8/9/24 6:59 PM, Abdoulaye Ba wrote:>     The primary use case for
this hook is to allow extensions to account

    for additional storage mechanisms that are not covered by the
    default PostgreSQL relation size calculations. For instance, in our
    project, we are working with an external indexing system (Tantivy)
    that maintains additional data structures outside the standard
    PostgreSQL storage. This hook allows us to include the size of these
    additional structures in the total relation size calculations.

    While I understand your suggestion about custom index AMs, the
    intent behind this hook is broader. It is not limited to custom
    index types but can also be used for other forms of external storage
    that need to be accounted for in relation size calculations. This is
    why a generic callback hook was chosen over extending the index AM
    interface.

    However, if there is a consensus that such a hook would be better
    suited within the index AM interface for cases involving custom
    index storage, I'm open to discussing this further and exploring how
    it could be integrated more tightly with the existing PostgreSQL AM
    framework.

Yeah, I strongly suspected it was ParadeDB. :)

I am only one developer but I really do not like solving this with a
hook, instead I think the proper solution is to integrate this properly
with custom AMs and storage managers. I think we should do it properly
or not at all.

Not sure. I'd agree if the index was something that could be implemented
through index AM - then that'd be the way to go. It might require some
improvements to the index AM to use the correct index size, haven't checked.

But it seems pg_search (which AFAIK is what paradedb uses to integrate
tantivy indexes) uses the term "index" for something very different. I'm
not sure that's something that could be conveniently implemented as
index AM, but I haven't checked. But that just raises the question why
should that be included in pg_total_relation_size and pg_indexes_size at
all, if it's not what we'd call an index.

regards

--
Tomas Vondra