[Proposal] Adding callback support for custom statistics kinds

Started by Sami Imseihabout 2 months ago35 messages

Threaded Flat

Oldest First Newest First

Sami Imseih

samimseih@gmail.com

about 2 months ago

2 attachment(s)

Hi,

I'd like to propose $SUBJECT to serialize additional per-entry data beyond
the standard statistics entries. Currently, custom statistics kinds can store
their standard entry data in the main "pgstat.stat" file, but there is no
mechanism for extensions to persist extra data stored in the entry. A common
use case is extensions that register a custom kind and, besides
standard counters,
need to track variable-length data stored in a dsa_pointer.

This proposal adds optional "to_serialized_extra" and
"from_serialized_extra" callbacks to "PgStat_KindInfo" that allow custom kinds
to write and read from extra data in a separate files
(pgstat.<kind>.stat). The callbacks
give extensions direct access to the file pointer so they can read and write
data in any format, while the core "pgstat" infrastructure manages
opening, closing, renaming, and cleanup, just as it does with "pgstat.stat".

A concrete use case is pg_stat_statements. If it were to use custom
stats kinds to track statement counters, it could also track query text
stored in DSA. The callbacks allow saving the query text referenced by the
dsa_pointer and restoring it after a clean shutdown. Since DSA
(and more specifically DSM) cannot be attached by the postmaster, an
extension cannot use "on_shmem_exit" or "shmem_startup_hook"
to serialize or restore this data. This is why pgstat handles
serialization during checkpointer shutdown and startup, allowing a single
backend to manage it safely.

I considered adding hooks to the existing pgstat code paths
(pgstat_before_server_shutdown, pgstat_discard_stats, and
pgstat_restore_stats), but that felt too unrestricted. Using per-kind
callbacks provides more control.

There are already "to_serialized_name" and "from_serialized_name"
callbacks used to store and read entries by "name" instead of
"PgStat_HashKey", currently used by replication slot stats. Those
remain unchanged, as they serve a separate purpose.

Other design points:

1. Filenames use "pgstat.<kind>.stat" based on the numeric kind ID.
This avoids requiring extensions to provide names and prevents issues
with spaces or special characters.

2. Both callbacks must be registered together. Serializing without
deserializing would leave orphaned files behind, and I cannot think of a
reason to allow this.

3. "write_chunk", "read_chunk", "write_chunk_s", and
"read_chunk_s" are renamed to "pgstat_write_chunk", etc., and
moved to "pgstat_internal.h" so extensions can use them without
re-implementing these functions.

4. These callbacks are valid only for custom, variable-numbered statistics
kinds. Custom fixed kinds may not benefit, but could be considered in the
future.

Attached 0001 is the proposed change, still in POC form. The second patch
contains tests in "injection_points" to demonstrate this proposal, and is not
necessarily intended for commit.

Looking forward to your feedback!

Sami Imseih
Amazon Web Services (AWS)

Michael Paquier

michael@paquier.xyz

about 2 months ago

In reply to: Sami Imseih (#1)

Re: [Proposal] Adding callback support for custom statistics kinds

On Wed, Oct 22, 2025 at 03:24:11PM -0500, Sami Imseih wrote:

I'd like to propose $SUBJECT to serialize additional per-entry data beyond
the standard statistics entries. Currently, custom statistics kinds can store
their standard entry data in the main "pgstat.stat" file, but there is no
mechanism for extensions to persist extra data stored in the entry. A common
use case is extensions that register a custom kind and, besides
standard counters,
need to track variable-length data stored in a dsa_pointer.

Thanks for sending a proposal in this direction.

A concrete use case is pg_stat_statements. If it were to use custom
stats kinds to track statement counters, it could also track query text
stored in DSA. The callbacks allow saving the query text referenced by the
dsa_pointer and restoring it after a clean shutdown. Since DSA
(and more specifically DSM) cannot be attached by the postmaster, an
extension cannot use "on_shmem_exit" or "shmem_startup_hook"
to serialize or restore this data. This is why pgstat handles
serialization during checkpointer shutdown and startup, allowing a single
backend to manage it safely.

Agreed that it would be better to split the query text in a file of
its own and now bloat the "main" pgstats file with this data, A real
risk is that many PGSS entries with a bunch of queries would cause the
file to be just full of the PGSS contents.

I considered adding hooks to the existing pgstat code paths
(pgstat_before_server_shutdown, pgstat_discard_stats, and
pgstat_restore_stats), but that felt too unrestricted. Using per-kind
callbacks provides more control.

Per-kind callbacks to control all that makes sense here.

There are already "to_serialized_name" and "from_serialized_name"
callbacks used to store and read entries by "name" instead of
"PgStat_HashKey", currently used by replication slot stats. Those
remain unchanged, as they serve a separate purpose.

Other design points:

1. Filenames use "pgstat.<kind>.stat" based on the numeric kind ID.
This avoids requiring extensions to provide names and prevents issues
with spaces or special characters.

Hmm. Is that really what we want here? This pretty says that one
single custom kind would never be able use multiple files, ever.

2. Both callbacks must be registered together. Serializing without
deserializing would leave orphaned files behind, and I cannot think of a
reason to allow this.

Hmm. Okay.

3. "write_chunk", "read_chunk", "write_chunk_s", and
"read_chunk_s" are renamed to "pgstat_write_chunk", etc., and
moved to "pgstat_internal.h" so extensions can use them without
re-implementing these functions.

Exposing the write and read chunk APIs and renaming them sounds good
here, designed as they are now with a FILE* defined by the caller.
It's good to share these for consistency across custom and built-in
stats kinds.

4. These callbacks are valid only for custom, variable-numbered statistics
kinds. Custom fixed kinds may not benefit, but could be considered in the
future.

Pushing custom data for fixed-sized stats may be interesting, though
like you I am not sure what a good use-case would look like. So
discarding this case for now sounds fine to me.

Attached 0001 is the proposed change, still in POC form.

Hmm. I would like to propose something a bit more flexible,
refactoring and reusing some of the existing callbacks, among the
following lines:
- Rather than introducing a second callback able to do more
serialization work, let's expand a bit the responsibility of
to_serialized_name and from_serialized_name to be able to work in a
more extended way, renaming them to "to/from_serialized_entry", which
are now limited to return a NameData with pgstat.c enforcing the data
written to the pgstats to be of NAMEDATALEN. The idea would be to let
the callbacks push some custom data where they want.
- The to_serialized_name path of pgstat_write_statsfile() would then
be changed as follows:
-- push a PGSTAT_FILE_ENTRY_NAME
-- Write the key write_chunk_s.
-- Call the callback to push some custom per-entry data.
-- Finish with the main chunk of data, of size pgstat_get_entry_len().
- The fd or FILE* of the "main" pgstats file should be added as
argument of both routines (not mandatory, but we are likely going to
need that if we want to add more "custom" data in the main pgstats
file before writing or reading a chunk). For example, for a PGSS text
file, we would likely write two fields to the main data file: an
offset and a length to be able to retrieve a query string, from a
secondary file.
- FDs where the data is written while we are in the to/from serialize
can be handled within the code paths specific to the stats kind code.
The first time a serialized callback of a stats kind is called, the
extra file(s) is(are) opened. This may come at the cost of one new
callback: at the end of the read and writes of the stats data, we
would need an extra look that's able to perform cleanup actions, which
would be here to make sure that the fds opened for the extra files are
closed when we are done. The close of each file is equivalent to the
pgstat_close_file() done in the patch, except that we'd loop over a
callback that would do the cleanup job once we are done reading or
writing a file. One step that can be customized in this new "end"
callback is if a stats kind may decide to unlink() a previous file, as
we do for the main pgstats file, or keep one or more files around.
That would be up to the extension developer. We should be able to
reuse or rework reset_all_cb() with a status given to it, depending on
if we are dealing with a failure or a success path. Currently,
reset_all_cb() is only used in a failure path, the idea would be to
extend it for the success case.

The second patch
contains tests in "injection_points" to demonstrate this proposal, and is not
necessarily intended for commit.

Having coverage for these kinds of APIs is always good, IMO. We need
coverage for extension code.
--
Michael

Sami Imseih

samimseih@gmail.com

about 2 months ago

In reply to: Michael Paquier (#2)

Re: [Proposal] Adding callback support for custom statistics kinds

Thanks for the feedback!

Other design points:

1. Filenames use "pgstat.<kind>.stat" based on the numeric kind ID.
This avoids requiring extensions to provide names and prevents issues
with spaces or special characters.

Hmm. Is that really what we want here? This pretty says that one
single custom kind would never be able use multiple files, ever.

Perhaps if someone wants to have separate files for each different
types of data,
we should be able to support multiple files. I think we can add an
option for the
number of files and they can then be named "pgstat.<kind>.1.stat",
pgstat.<kind>.2.stat",
etc. I rather avoid having the extension provide a set of files names.
So as arguments to the callback, besides the main file pointer ( as
you mention below),
we also provide the list of custom file pointers.

what do you think?

Hmm. I would like to propose something a bit more flexible,
refactoring and reusing some of the existing callbacks, among the
following lines:
- Rather than introducing a second callback able to do more
serialization work, let's expand a bit the responsibility of
to_serialized_name and from_serialized_name to be able to work in a
more extended way, renaming them to "to/from_serialized_entry", which

Sure, we can go that route.

- The fd or FILE* of the "main" pgstats file should be added as
argument of both routines (not mandatory, but we are likely going to
need that if we want to add more "custom" data in the main pgstats
file before writing or reading a chunk). For example, for a PGSS text
file, we would likely write two fields to the main data file: an
offset and a length to be able to retrieve a query string, from a
secondary file.

Yeah, that could be a good idea for pg_s_s, if we don't want to store the key
alongside the query text. Make more sense.

- FDs where the data is written while we are in the to/from serialize
can be handled within the code paths specific to the stats kind code.
The first time a serialized callback of a stats kind is called, the
extra file(s) is(are) opened. This may come at the cost of one new
callback: at the end of the read and writes of the stats data, we
would need an extra look that's able to perform cleanup actions, which
would be here to make sure that the fds opened for the extra files are
closed when we are done. The close of each file is equivalent to the
pgstat_close_file() done in the patch, except that we'd loop over a
callback that would do the cleanup job once we are done reading or
writing a file. One step that can be customized in this new "end"
callback is if a stats kind may decide to unlink() a previous file, as
we do for the main pgstats file, or keep one or more files around.
That would be up to the extension developer. We should be able to
reuse or rework reset_all_cb() with a status given to it, depending on
if we are dealing with a failure or a success path. Currently,
reset_all_cb() is only used in a failure path, the idea would be to
extend it for the success case.

I will provide a patch with the recommendations.

--
Sami

Michael Paquier

michael@paquier.xyz

about 2 months ago

In reply to: Sami Imseih (#3)

Re: [Proposal] Adding callback support for custom statistics kinds

On Thu, Oct 23, 2025 at 04:35:58PM -0500, Sami Imseih wrote:

Perhaps if someone wants to have separate files for each different
types of data,
we should be able to support multiple files. I think we can add an
option for the
number of files and they can then be named "pgstat.<kind>.1.stat",
pgstat.<kind>.2.stat",
etc. I rather avoid having the extension provide a set of files names.
So as arguments to the callback, besides the main file pointer ( as
you mention below),
we also provide the list of custom file pointers.

what do you think?

My worry here is the lack of flexibility regarding stats that could be
split depending on the objects whose data needs to be flushed. For
example, stats split across multiple databases (like our good-old
pre-v14 pgstats, but on a per-kind basis). So I don't think that we
can really assume that the list of file names should be fixed when we
begin the read/write process of the main pgstats file.
--
Michael

Sami Imseih

samimseih@gmail.com

about 2 months ago

In reply to: Michael Paquier (#4)

Re: [Proposal] Adding callback support for custom statistics kinds

On Thu, Oct 23, 2025 at 04:35:58PM -0500, Sami Imseih wrote:

Perhaps if someone wants to have separate files for each different
types of data,
we should be able to support multiple files. I think we can add an
option for the
number of files and they can then be named "pgstat.<kind>.1.stat",
pgstat.<kind>.2.stat",
etc. I rather avoid having the extension provide a set of files names.
So as arguments to the callback, besides the main file pointer ( as
you mention below),
we also provide the list of custom file pointers.

what do you think?

My worry here is the lack of flexibility regarding stats that could be
split depending on the objects whose data needs to be flushed. For
example, stats split across multiple databases (like our good-old
pre-v14 pgstats, but on a per-kind basis). So I don't think that we
can really assume that the list of file names should be fixed when we
begin the read/write process of the main pgstats file.

I was trying to avoid an extra field in PgStat_KindInfo if possible, but
it's worthwhile to provide more flexibility to an extension. I will go
with this.

--
Sami Imseih
Amazon Web Services (AWS)

Michael Paquier

michael@paquier.xyz

about 2 months ago

In reply to: Sami Imseih (#5)

Re: [Proposal] Adding callback support for custom statistics kinds

On Thu, Oct 23, 2025 at 07:57:38PM -0500, Sami Imseih wrote:

I was trying to avoid an extra field in PgStat_KindInfo if possible, but
it's worthwhile to provide more flexibility to an extension. I will go
with this.

Yes, I don't think that we will be able to avoid some refactoring of
the existing callbacks. The introduction of a new one may not be
completely necessary, though, especially if we reuse the reset
callback to be called when the stats read and write finish to close
any fds we may have opened when processing.

Maintaining the state of the files opened within each stat kind code
across multiple calls of the new "serialized" callback feels a bit
more natural and more flexible, at least it's my take on the matter.
--
Michael

Sami Imseih

samimseih@gmail.com

about 1 month ago

In reply to: Sami Imseih (#3)

Re: [Proposal] Adding callback support for custom statistics kinds

Hmm. I would like to propose something a bit more flexible,
refactoring and reusing some of the existing callbacks, among the
following lines:
- Rather than introducing a second callback able to do more
serialization work, let's expand a bit the responsibility of
to_serialized_name and from_serialized_name to be able to work in a
more extended way, renaming them to "to/from_serialized_entry", which

Sure, we can go that route.

I started reworking the patch, but then I realized that I don't like this
approach of using the same callback to support serializing NameData and
serializing extra data. In the existing "to_serialized_name" callback
, NameData is serialized instead of the hash key, meaning that the
"from_serialized_name" must be called before we create an entry. The
callback translates the NameData to an objid as is the case with replication
slots, and the key is then used to create the entry.

However, in the case of serializing extra data, we want to have already
created the entry by the time we call the callback. For example populating
non-key fields of an entry with a dsa_pointer after reading some serialized
data into dsa.

If we do want to support a single callback, we would need extra metadata in
the Kind registration to let the extension tell us what the callback is used
for and to either trigger the callback before or after entry creation. I am
not very thrilled about doing something like this, as I see 2 very different
use-cases here.

What do you think?

--
Sami Imseih
Amazon Web Services (AWS)

Michael Paquier

michael@paquier.xyz

about 1 month ago

In reply to: Sami Imseih (#7)

Re: [Proposal] Adding callback support for custom statistics kinds

On Mon, Nov 10, 2025 at 01:56:23PM -0600, Sami Imseih wrote:

I started reworking the patch, but then I realized that I don't like this
approach of using the same callback to support serializing NameData and
serializing extra data. In the existing "to_serialized_name" callback
, NameData is serialized instead of the hash key, meaning that the
"from_serialized_name" must be called before we create an entry. The
callback translates the NameData to an objid as is the case with replication
slots, and the key is then used to create the entry.

Thanks for looking at that.

However, in the case of serializing extra data, we want to have already
created the entry by the time we call the callback. For example populating
non-key fields of an entry with a dsa_pointer after reading some serialized
data into dsa.

If we do want to support a single callback, we would need extra metadata in
the Kind registration to let the extension tell us what the callback is used
for and to either trigger the callback before or after entry creation. I am
not very thrilled about doing something like this, as I see 2 very different
use-cases here.

Ah, I see your point. By keeping two callbacks, one to translate a
key to/from a different field (NameData currently, but it could be
something else with a different size), we would for example be able to
keep very simple the checks for duplicated entries when reading the
file. Agreed that it would be good to keep the key lookups as stable
as we can.

So, what you are suggested is a second callback once we have called
read_chunk() and write_chunk() for a PGSTAT_FILE_ENTRY_HASH or a
PGSTAT_FILE_ENTRY_NAME and let a stats kind write in the main file
and/or one or more extra files the data they want? I'd be fine with
that, yes, and that should work with the PGSS case in mind.
--
Michael

Sami Imseih

samimseih@gmail.com

21 days ago

In reply to: Michael Paquier (#8)

2 attachment(s)

Re: [Proposal] Adding callback support for custom statistics kinds

Sorry for the delay here.

v1 is the first attempt to address the feedback from the POC.

1/ A user is now able to register as many extra files as they
wish, and the files will be named pgstat.<kind_id>.<file no>.stat,
where file_no starts at 0 up to the number of files specified
by the user with .num_serialized_extra_files.

2/ The callbacks now provide both the core stat file as a FILE
pointer and an array of FILE pointers for the extra files.
IN the write callback, the extra file pointer is accessed
like extra_files[0], extra_files[1], etc., and the same for
the read callback.

3/ The patch centralizes the creation and cleanup of the files
with 2 new routines pgstat_allocate_files and pgstat_cleanup_files,
which both operate on a new local struct which tracks the file
names and descriptors in the read and write stats routines.

```
typedef struct PgStat_SerializeFiles
{
char **tmpfiles;
char **statfiles;
FILE **fd;
int num_files;
} PgStat_SerializeFiles;
```

plug-ins are not made aware of this struct because they don't need
to. The callbacks are provided the FILE pointers they need to care
about for their kind only.

4/ In terms of testing, patch 0002, I did not want to invent a new module
for custom kinds, so I piggybacked off og injection_points as I did in the
POC, but I added on the existing recovery tests, because essentially that
is what we care. Does the data persist after a clean shutdown? do the
.tmp files get removed properly? etc. So I added tests in
recovery/t/029_stats_restart.pl for this.

--
Sami Imseih
Amazon Web Services (AWS)

#10

Sami Imseih

samimseih@gmail.com

21 days ago

In reply to: Sami Imseih (#9)

Re: [Proposal] Adding callback support for custom statistics kinds

It just occurred to me that the documentation [0]https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-CUSTOM-CUMULATIVE-STATISTICS should be
updated to describe the callbacks. I will do that in the next
revision.

[0]: https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-CUSTOM-CUMULATIVE-STATISTICS

--
Sami Imseih
Amazon Web Services (AWS)

#11

Michael Paquier

michael@paquier.xyz

17 days ago

In reply to: Sami Imseih (#10)

Re: [Proposal] Adding callback support for custom statistics kinds

On Wed, Nov 19, 2025 at 08:10:43PM -0600, Sami Imseih wrote:

It just occurred to me that the documentation [0] should be
updated to describe the callbacks. I will do that in the next
revision.

[0] https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-CUSTOM-CUMULATIVE-STATISTICS

Hmm.  Based on what I can read from the patch, you are still enforcing
file name patterns in the backend, as of:
+          extra->statfiles[i] = psprintf("%s/pgstat.%d.%d.stat",
+                        PGSTAT_STAT_PERMANENT_DIRECTORY, kind, i);

My take (also mentioned upthread) is that this design should go the
other way around, where modules have the possibility to define their
own file names, and some of them could be generated on-the-fly when
writing the files, for example for a per-file database split, or the
object ID itself.

The important part for variable-numbered stats is that the keys of the
entries have to be in the main pgstats file. Then, the extra data is
loaded back based on the data in the entry key, based on a file name
that only a custom stats kind knows about (fd and file name). It
means that the custom stats kind needs to track the files it has to
clean up by itself in this scheme. We could pass back to the startup
process some fds that it cleans up, but it feels simpler here to let
the custom code do what they want, instead, rather than having an
array that tracks the file names and/or their fds.
--
Michael

#12

Sami Imseih

samimseih@gmail.com

17 days ago

In reply to: Michael Paquier (#11)

Re: [Proposal] Adding callback support for custom statistics kinds

It just occurred to me that the documentation [0] should be
updated to describe the callbacks. I will do that in the next
revision.

[0] https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-CUSTOM-CUMULATIVE-STATISTICS
Hmm.  Based on what I can read from the patch, you are still enforcing
file name patterns in the backend, as of:
+          extra->statfiles[i] = psprintf("%s/pgstat.%d.%d.stat",
+                        PGSTAT_STAT_PERMANENT_DIRECTORY, kind, i);
My take (also mentioned upthread) is that this design should go the
other way around, where modules have the possibility to define their
own file names, and some of them could be generated on-the-fly when
writing the files, for example for a per-file database split, or the
object ID itself.

The way I thought about it is that extension developer can just provide the
number of files they need and the they are then given a list of
file pointers that they need. They can then manage what each file is
used for. They also don't need to worry about naming the files, all they
need to do is track what each file in the list does.

The important part for variable-numbered stats is that the keys of the
entries have to be in the main pgstats file. Then, the extra data is
loaded back based on the data in the entry key, based on a file name
that only a custom stats kind knows about (fd and file name). It
means that the custom stats kind needs to track the files it has to
clean up by itself in this scheme. We could pass back to the startup
process some fds that it cleans up, but it feels simpler here to let
the custom code do what they want, instead, rather than having an
array that tracks the file names and/or their fds.

yeah, I was leaning towards putting more responsibility on pgstat to
manage these extra files, but you are suggesting that we just let the
extension manage the create/cleanup of these files as well.

After re-reading your earlier suggestions, this sounds like a third
callback that is used for file cleanup, and this callback could be
the existing reset_all_cb. Also, instead of reset_all_cb being called
during pgstat_reset_after_failure, it can be called during the success
case, i.e, a new pgstat_reset_after_success. reset_all_cb also
carries a status argument so the extension knows what to do
in the case of success or failure.

This also means we need to also update all existing callbacks to
do work in the failed status.

Is that correct?

--
Sami

#13

Sami Imseih

samimseih@gmail.com

16 days ago

In reply to: Sami Imseih (#12)

Re: [Proposal] Adding callback support for custom statistics kinds

After re-reading your earlier suggestions, this sounds like a third
callback that is used for file cleanup, and this callback could be
the existing reset_all_cb. Also, instead of reset_all_cb being called
during pgstat_reset_after_failure, it can be called during the success
case, i.e, a new pgstat_reset_after_success. reset_all_cb also
carries a status argument so the extension knows what to do
in the case of success or failure.

This also means we need to also update all existing callbacks to
do work in the failed status.

After second thought, I am not too thrilled with extending reset_all_cb
to take responsibility for file cleanup, etc. I think it should just remain
used to reset stats only.

I think the best way forward will be to introduce a callback to be used by
custom kinds only. This callback will be responsible for cleaning up files
and related resources at the end of the write stats, read stats, and discard
stats paths. The callback will provide back to the extension a status
(READ, WRITE, DISCARD) and the extension will know how to clean up the
resources it created depending on the situation.

So, unlike my original proposal, this puts more responsibility on the
extension to track and clean up its files, but this seems like the best
approach to take here.

Also, I am now leaning towards creating a separate test module rather than
trying to do too much unrelated testing in injection points. It is definitely
convenient to use injection points, but I think we can do better testing with
a separate module. This module can also serve as an example for extension
developers.

what do you think?

--
Sami Imseih
Amazon Web Services (AWS)

#14

Michael Paquier

michael@paquier.xyz

16 days ago

In reply to: Sami Imseih (#13)

Re: [Proposal] Adding callback support for custom statistics kinds

On Mon, Nov 24, 2025 at 06:18:26PM -0600, Sami Imseih wrote:

After second thought, I am not too thrilled with extending reset_all_cb
to take responsibility for file cleanup, etc. I think it should just remain
used to reset stats only.

I think the best way forward will be to introduce a callback to be used by
custom kinds only. This callback will be responsible for cleaning up files
and related resources at the end of the write stats, read stats, and discard
stats paths. The callback will provide back to the extension a status
(READ, WRITE, DISCARD) and the extension will know how to clean up the
resources it created depending on the situation.

I guess that READ and WRITE are the cases that happen on success of
these respective operations. DISCARD is the failure case when one of
these fail.

So, unlike my original proposal, this puts more responsibility on the
extension to track and clean up its files, but this seems like the best
approach to take here.

That may be something we should do anyway. It means that the modules
are responsible for the tracking the file(s) they open, still they
could also decide operations different than the backend for the main
pgstats file, optionally, depending on the state of the reads and
writes (aka success or failure of these).

Also, I am now leaning towards creating a separate test module rather than
trying to do too much unrelated testing in injection points. It is definitely
convenient to use injection points, but I think we can do better testing with
a separate module. This module can also serve as an example for extension
developers.

You are right that it may be cleaner this way. Do you think that it
could make sense to move some of the existing "template" code of
injection_points there?

One part of the proposed patch that felt independent to me was the
renaming and publishing of the two write/read routines for the stats
files, so I have extracted that in your first patch to reduce the
blast, and applied that as it can also be useful on its own.
--
Michael

#15

Sami Imseih

samimseih@gmail.com

8 days ago

In reply to: Michael Paquier (#14)

2 attachment(s)

Re: [Proposal] Adding callback support for custom statistics kinds

Also, I am now leaning towards creating a separate test module rather than
trying to do too much unrelated testing in injection points. It is definitely
convenient to use injection points, but I think we can do better testing with
a separate module. This module can also serve as an example for extension
developers.

You are right that it may be cleaner this way. Do you think that it
could make sense to move some of the existing "template" code of
injection_points there?

By "template" code, do you mean Something like?

include/utils/custom_statkinds.h
backend/utils/misc/custom_statkinds.c

Where the template code here is PgStat_kind definition, callbacks, etc. for
injection_points or the new test module that is using a custom kind.

A few benefits I see for this is we can point extension developers to
this as an example in [0]https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-CUSTOM-CUMULATIVE-STATISTICS and we are also maintaining the kind ids in
a single place. These may not be strong points, but may be worth while.

v2 attached is something that may be closer to what we've been discussing

v2-0001 are much simplified changes to pgstat.c that simply invoke the callbacks
and all the work is on the extension to implement what it needs to do.
This includes
a callback at the end of WRITE, READ, DISCARD with a flag passed to the caller
so they can perform the necessary clean-up actions.

v2-0002 implements a new test module that tests mainly that the recovery,
clean and crash, are working as expected.

I created a new tap test for this which performs a test similar to what is
done in recovery/029_stats_restart.pl. I could merge the new test there, but
I am reluctant to add a dependency on a new module to recovery. What
do you think?

[0]: https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-CUSTOM-CUMULATIVE-STATISTICS

--
Sami Imseih
Amazon Web Services (AWS)

#16

Michael Paquier

michael@paquier.xyz

8 days ago

In reply to: Sami Imseih (#15)

Re: [Proposal] Adding callback support for custom statistics kinds

On Tue, Dec 02, 2025 at 02:58:32PM -0600, Sami Imseih wrote:

By "template" code, do you mean Something like?

include/utils/custom_statkinds.h
backend/utils/misc/custom_statkinds.c

Where the template code here is PgStat_kind definition, callbacks, etc. for
injection_points or the new test module that is using a custom kind.

I am not completely sure that we need a separate C file for a portion
of the code related to custom stats kinds. At least, I am not sure to
see which part of pgstat.c, pgstat_internal.h and pgstat_shmem.c could
be extracted so as a custom_statkinds.c could have value when taken
independently. A test module makes the most sense for such templates
IMO, as they can be compiled and checked directly.

v2-0001 are much simplified changes to pgstat.c that simply invoke the callbacks
and all the work is on the extension to implement what it needs to do.
This includes
a callback at the end of WRITE, READ, DISCARD with a flag passed to the caller
so they can perform the necessary clean-up actions.

+	void		(*to_serialized_extra_stats) (const PgStat_HashKey *key,
+											  const PgStatShared_Common *header, FILE *statfile);
+	void		(*from_serialized_extra_stats) (const PgStat_HashKey *key,
+												const PgStatShared_Common *header, FILE *statfile);
+	void		(*end_extra_stats) (PgStat_StatsFileOp status);
[...]
+typedef struct PgStatShared_CustomEntry
+{
+	PgStatShared_Common header;
+	PgStat_StatCustomEntry stats;
+	char		name[NAMEDATALEN];
+	dsa_pointer description;
+}			PgStatShared_CustomEntry;

I'm cool with this design, including your point about using a DSA
pointer in a stats entry, manipulating this data through the
serialization callback. Your module does not use the FILE* which
points to the main stats file for the to/from extra serialized
callbacks, it seems important to document in pgstat_internal.h that
this points to the "main" pgstats file manipulated by the startup
process when loading or by the checkpointer when flushing.

Perhaps the callback in the module for end_extra_stats should use a
switch based on PgStat_StatsFileOp. Minor point.

+/* File handle for statistics serialization */
+static FILE *fd = NULL;

Using a fd tracked directly by the module sounds good to me, yes.
That gives to the modules the flexibility to decide what should be the
extra files to know about, some file name patterns being possible to
decide based on the stats entry keys that need to be written, with
files opened when actually required.

v2-0002 implements a new test module that tests mainly that the recovery,
clean and crash, are working as expected.

That looks like a good direction to me. The only differences I can
see with the stats module in injection_points for variable-sized stats
is that this new module does not check pgstat_drop_entry() and
pgstat_fetch_entry() when working on a custom stats kind. If we had
SQL interfaces calling these two, we could just remove
injection_stats.c entirely, moving everything to this new test module.
I should have invented a new module from the start, perhaps, but well,
that was good enough to check the basic APIs when working on the
custom APIs. Removing this duplication would be my own business with
your module in the tree, no need for you to worry about that. That
would also remove the tweak you have used regarding the duplicated
kind ID.

Perhaps we should do the same for the fixed-sized kind at the end, and
instead of using one .so for both of them, we could just create a
separate .so with multiple entries in MODULES? What do you think?
What you have here is better than what's in the tree in terms of
module separation for HEAD.

I created a new tap test for this which performs a test similar to what is
done in recovery/029_stats_restart.pl. I could merge the new test there, but
I am reluctant to add a dependency on a new module to recovery. What
do you think?

Adding an extra item to recovery's EXTRA_INSTALL would be OK for me,
but it seems cleaner to me to keep the tests related to custom stats
in their own area like your patch 0002 is doing with its new test
module test_custom_statkind. And 029_stats_restart.pl is already
covering a lot of ground.

+     if (pgstat_is_kind_custom(key.kind) && kind_info->from_serialized_extra_stats)
+         kind_info->from_serialized_extra_stats(&key, header, fpin);
[...]
+     if (pgstat_is_kind_custom(ps->key.kind) && kind_info->to_serialized_extra_stats)
+         kind_info->to_serialized_extra_stats(&ps->key, shstats, fpout);

These restrictions based on custom kinds do not seem mandatory.
Why not allowing built-in kinds the same set of operations?

+	/* Read and verify the hash key */
+	if (!pgstat_read_chunk(fd, (void *) key, sizeof(PgStat_HashKey)))
+		return;
[...]
+	/* Write the hash key to identify this entry */
+	pgstat_write_chunk(fd, (void *) key, sizeof(PgStat_HashKey));

I am puzzled by this part of 0002. Why are you overwriting the key
once after loading it from the main pgstats file? Writing the key to
cross-check that the data matches with what is in the main file is OK,
and this should be ensured because of the ordering of the data. I
would have done it in a slightly different way, I guess, with the data
stored on disk in the main pgstats file including an offset to know
where to search in the secondary file. That's what we would do for
PGSS as well, I guess, with the secondary file including data
structured as a set of:
- Entry key, cross-checked with the data read from the main file,
based on the offset stored in the main file.
- Length of extra data.
- The extra data contents.

As a whole, I find this patch pretty cool, particularly the point
about extending stats entries with DSAs, something that would be
essential for PGSS and move it to use pgstats because we don't want
the query strings in the main pgstats file and bloat it. Nice.
--
Michael

#17

Chao Li

li.evan.chao@gmail.com

8 days ago

In reply to: Sami Imseih (#15)

Re: [Proposal] Adding callback support for custom statistics kinds

On Dec 3, 2025, at 04:58, Sami Imseih <samimseih@gmail.com> wrote:

Also, I am now leaning towards creating a separate test module rather than
trying to do too much unrelated testing in injection points. It is definitely
convenient to use injection points, but I think we can do better testing with
a separate module. This module can also serve as an example for extension
developers.

You are right that it may be cleaner this way. Do you think that it
could make sense to move some of the existing "template" code of
injection_points there?

By "template" code, do you mean Something like?

include/utils/custom_statkinds.h
backend/utils/misc/custom_statkinds.c

Where the template code here is PgStat_kind definition, callbacks, etc. for
injection_points or the new test module that is using a custom kind.

A few benefits I see for this is we can point extension developers to
this as an example in [0] and we are also maintaining the kind ids in
a single place. These may not be strong points, but may be worth while.

v2 attached is something that may be closer to what we've been discussing

v2-0001 are much simplified changes to pgstat.c that simply invoke the callbacks
and all the work is on the extension to implement what it needs to do.
This includes
a callback at the end of WRITE, READ, DISCARD with a flag passed to the caller
so they can perform the necessary clean-up actions.

v2-0002 implements a new test module that tests mainly that the recovery,
clean and crash, are working as expected.

I created a new tap test for this which performs a test similar to what is
done in recovery/029_stats_restart.pl. I could merge the new test there, but
I am reluctant to add a dependency on a new module to recovery. What
do you think?

[0] https://www.postgresql.org/docs/current/xfunc-c.html#XFUNC-ADDIN-CUSTOM-CUMULATIVE-STATISTICS

--
Sami Imseih
Amazon Web Services (AWS)
<v2-0002-Tests-for-custom-stat-kinds.patch><v2-0001-pgstat-support-custom-serialization-files-and-cal.patch>

Thanks for the patch, I do think the feature will be useful. After reading the patch, I got a concern on the design:

This patch provides callbacks that requests (also allows) custom extensions to write stat files on their own behalf, which I think it’s unsafe. The problems coming out to my head includes:

* An extension can write to any where on the storage, that what about it writes to /tmp and the files are deleted by other process or by a user manually incidentally?
* pgstat has a pattern of writing files like: writing to tmp file first, then durable_rename(), how to ensure extensions to do the same pattern? Without this pattern, how to ensure reliability of stat files?
* In the current path, pgstat performs its own write, then call callbacks. What about if a callback fails? Will that leave pgstat in a stale state?
* As extensions own file creation and deletion, in some case, staled file might be left on storage, who will be responsible for cleaning up them?

Given the goal of the feature is to allow extensions to serialize custom data, the callback should just return serialized/deserialized data, maybe together with some metadata, then pgstat should be responsible for writing the data. In other words, IMO, pgstat should always own stat files.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#18

Michael Paquier

michael@paquier.xyz

8 days ago

In reply to: Chao Li (#17)

Re: [Proposal] Adding callback support for custom statistics kinds

On Wed, Dec 03, 2025 at 01:41:44PM +0800, Chao Li wrote:

Thanks for the patch, I do think the feature will be useful. After reading the patch, I got a concern on the design:

This patch provides callbacks that requests (also allows) custom
extensions to write stat files on their own behalf, which I think
it’s unsafe. The problems coming out to my head includes:

* An extension can write to any where on the storage, that what
* about it writes to /tmp and the files are deleted by other process
* or by a user manually incidentally?

I mean, just don't do that. It's up to the extension developer to
decide what is safe or not, within the scope of the data folder.

* pgstat has a pattern of writing files like: writing to tmp file
* first, then durable_rename(), how to ensure extensions to do the
* same pattern? Without this pattern, how to ensure reliability of
* stat files?

Extension code would be responsible for ensuring that.

* In the current path, pgstat performs its own write, then call
* callbacks. What about if a callback fails? Will that leave pgstat
* in a stale state?

For the write state, end_extra_stats() would take care of that. It
depends on what kind of errors you would need to deal with, but as
proposed the patch would offer the same level of protection for the
writes of the stats, where we'd check for an error on the fd saved by
an extension for an extra file.

I think that you have a fair point about the stats read path though,
shouldn't we make the callback from_serialized_extra_stats() return a
state to be able to trigger a full-scale cleanup, at least?

* As extensions own file creation and deletion, in some case, staled
* file might be left on storage, who will be responsible for
* cleaning up them?

The extension should be able to handle that, I guess.

Given the goal of the feature is to allow extensions to serialize
custom data, the callback should just return serialized/deserialized
data, maybe together with some metadata, then pgstat should be
responsible for writing the data. In other words, IMO, pgstat should
always own stat files.

That's where my view of the matter differs, actually, pushing down the
responsibility into the extension code itself. A key argument,
mentioned upthread, is that the file paths could depend on the stats
entry *keys*, which may not be known in advance when beginning the
flush of the stats. Think about per-database file stats, or just
some per-object file stats, for example, which is an option that would
matter so as we do not bloat the main pgstats file.
--
Michael

#19

Chao Li

li.evan.chao@gmail.com

8 days ago

In reply to: Michael Paquier (#18)

Re: [Proposal] Adding callback support for custom statistics kinds

On Dec 3, 2025, at 13:54, Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Dec 03, 2025 at 01:41:44PM +0800, Chao Li wrote:

Thanks for the patch, I do think the feature will be useful. After reading the patch, I got a concern on the design:

This patch provides callbacks that requests (also allows) custom
extensions to write stat files on their own behalf, which I think
it’s unsafe. The problems coming out to my head includes:

* An extension can write to any where on the storage, that what
* about it writes to /tmp and the files are deleted by other process
* or by a user manually incidentally?

I mean, just don't do that. It's up to the extension developer to
decide what is safe or not, within the scope of the data folder.

* pgstat has a pattern of writing files like: writing to tmp file
* first, then durable_rename(), how to ensure extensions to do the
* same pattern? Without this pattern, how to ensure reliability of
* stat files?

Extension code would be responsible for ensuring that.

* In the current path, pgstat performs its own write, then call
* callbacks. What about if a callback fails? Will that leave pgstat
* in a stale state?

For the write state, end_extra_stats() would take care of that. It
depends on what kind of errors you would need to deal with, but as
proposed the patch would offer the same level of protection for the
writes of the stats, where we'd check for an error on the fd saved by
an extension for an extra file.

I think that you have a fair point about the stats read path though,
shouldn't we make the callback from_serialized_extra_stats() return a
state to be able to trigger a full-scale cleanup, at least?

* As extensions own file creation and deletion, in some case, staled
* file might be left on storage, who will be responsible for
* cleaning up them?

The extension should be able to handle that, I guess.

Yes, they of course can do, but that’s out of pgstat’s control. How can we ensure that?

Given the goal of the feature is to allow extensions to serialize
custom data, the callback should just return serialized/deserialized
data, maybe together with some metadata, then pgstat should be
responsible for writing the data. In other words, IMO, pgstat should
always own stat files.

That's where my view of the matter differs, actually, pushing down the
responsibility into the extension code itself. A key argument,
mentioned upthread, is that the file paths could depend on the stats
entry *keys*, which may not be known in advance when beginning the
flush of the stats. Think about per-database file stats, or just
some per-object file stats, for example, which is an option that would
matter so as we do not bloat the main pgstats file.
--
Michael

If we push down the responsibility into the extension code, then all extensions that want to enjoy the callbacks have to handle the same complexities of dealing with stat files, which sounds big duplicate efforts.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#20

Sami Imseih

samimseih@gmail.com

6 days ago

In reply to: Chao Li (#19)

Re: [Proposal] Adding callback support for custom statistics kinds

If we push down the responsibility into the extension code, then all extensions
that want to enjoy the callbacks have to handle the same complexities of dealing
with stat files, which sounds big duplicate efforts.

Thanks for the input! Yes, this is a trade-off between putting
responsibility on the
extension vs core. The initial thought I had was exactly like yours, but it will
be easier to get something pushed if we make the core changes as minimal as
possible. If there are enough complaints in the future, this can be revisited.
Particularly if there is a common patterns for file cleanup, this
could be turned
into a core utility.

That looks like a good direction to me. The only differences I can
see with the stats module in injection_points for variable-sized stats
is that this new module does not check pgstat_drop_entry() and
pgstat_fetch_entry() when working on a custom stats kind. If we had
SQL interfaces calling these two, we could just remove
injection_stats.c entirely, moving everything to this new test module.

I should have invented a new module from the start, perhaps, but well,
that was good enough to check the basic APIs when working on the
custom APIs. Removing this duplication would be my own business with
your module in the tree, no need for you to worry about that. That
would also remove the tweak you have used regarding the duplicated
kind ID.

I plan on addressing the other comments.

However, as discussed off-list, I do think moving the custom kind tests from
injection points to the new test module is a prerequisite. I rather
not have us
push a new test module that is doing duplicate work as the injection
stats tests.
I worked on this refactoring today and plan to have a patch ready for review
by tomorrow.

--
Sami Imseih
Amazon Web Services (AWS)

#21

Michael Paquier

michael@paquier.xyz

6 days ago

In reply to: Sami Imseih (#20)

Re: [Proposal] Adding callback support for custom statistics kinds

On Thu, Dec 04, 2025 at 05:00:10PM -0600, Sami Imseih wrote:

Thanks for the input! Yes, this is a trade-off between putting
responsibility on the
extension vs core. The initial thought I had was exactly like yours, but it will
be easier to get something pushed if we make the core changes as minimal as
possible. If there are enough complaints in the future, this can be revisited.
Particularly if there is a common patterns for file cleanup, this
could be turned
into a core utility.

Another way to shape it would be to have an in-core routine that
provides a default logic for the actions to take depending on the
write, read or discard state, with the state and a FILE* as arguments.
The main pgstats file would call that, modules may decide to use it.

However, as discussed off-list, I do think moving the custom kind tests from
injection points to the new test module is a prerequisite. I rather
not have us push a new test module that is doing duplicate work as the injection
stats tests.
I worked on this refactoring today and plan to have a patch ready for review
by tomorrow.

Cool, thanks!
--
Michael

#22

Sami Imseih

samimseih@gmail.com

5 days ago

In reply to: Michael Paquier (#21)

1 attachment(s)

Re: [Proposal] Adding callback support for custom statistics kinds

However, as discussed off-list, I do think moving the custom kind tests from
injection points to the new test module is a prerequisite. I rather
not have us push a new test module that is doing duplicate work as the injection
stats tests.
I worked on this refactoring today and plan to have a patch ready for review
by tomorrow.

Cool, thanks!

Attached is the new test module that replaces the custom statistics
tests currently in the injection points tests. Under test_custom_stats, there
are two separate modules: one for variable-amount stats and one for
fixed-amount stats. With this, we can completely remove the
stats-related tests and supporting code under
src/test/modules/injection_points/.

A few notes on the tests:

1. Variable stats: pgstat_drop_entry() and pgstat_fetch_entry() are
exercised here, addressing an earlier point raised in the thread.

2. Fixed-amount stats: I added specific tests for reset behavior; both
during crash recovery and during manual resets.

3. In test_custom_fixed_stats.c, you will see this comment:
```
/* see explanation above PgStatShared_Archiver for the reset protocol */
LWLockAcquire(&stats_shmem->lock, LW_EXCLUSIVE);
```
This is intentional, as the reset protocol is documented at the
referenced location [0]https://github.com/postgres/postgres/blob/master/src/include/utils/pgstat_internal.h#L362-L382. I wanted to call that out for the patch review.

Once this gets pushed, it will simplify the remaining work needed
for the remaining serialization callbacks work.

[0]: https://github.com/postgres/postgres/blob/master/src/include/utils/pgstat_internal.h#L362-L382

--
Sami Imseih
Amazon Web Services (AWS)

#23

Sami Imseih

samimseih@gmail.com

3 days ago

In reply to: Sami Imseih (#22)

2 attachment(s)

Re: [Proposal] Adding callback support for custom statistics kinds

I took a look at the earlier comments.

+    /* Read and verify the hash key */
+    if (!pgstat_read_chunk(fd, (void *) key, sizeof(PgStat_HashKey)))
+        return;
[...]
+    /* Write the hash key to identify this entry */
+    pgstat_write_chunk(fd, (void *) key, sizeof(PgStat_HashKey));

I am puzzled by this part of 0002. Why are you overwriting the key
once after loading it from the main pgstats file?

Yes, this is not necessary. I removed it.

I would have done it in a slightly different way, I guess, with the data
stored on disk in the main pgstats file including an offset to know
where to search in the secondary file.

that's a much better approach and the pattern we would want to use
going forward. Since this does not require we read the entires
back in the same order as written, so it's much more flexible.
I not did this in the test module.

Perhaps the callback in the module for end_extra_stats should use a
switch based on PgStat_StatsFileOp. Minor point.

Agree. Done.

* In the current path, pgstat performs its own write, then call
* callbacks. What about if a callback fails? Will that leave pgstat
* in a stale state?

For the write state, end_extra_stats() would take care of that. It
depends on what kind of errors you would need to deal with, but as
proposed the patch would offer the same level of protection for the
writes of the stats, where we'd check for an error on the fd saved by
an extension for an extra file.

I think that you have a fair point about the stats read path though,
shouldn't we make the callback from_serialized_extra_stats() return a
state to be able to trigger a full-scale cleanup, at least?

In this case, even if the callback does not return a state, the cleanup
will eventually occur at the end of the read, see

```
done:
/* First, cleanup the main stats file, PGSTAT_STAT_PERMANENT_FILENAME */
FreeFile(fpin);

elog(DEBUG2, "removing permanent stats file \"%s\"", statfile);
unlink(statfile);

/* Let each stats kind run its cleanup callback, if it provides one */
for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
{
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);

if (kind_info && kind_info->end_extra_stats)
kind_info->end_extra_stats(STATS_READ);
}
```

However, this could also mean some entries could be read back correctly, and
others not, so maybe it's not such a good idea. So, I did what is suggested
and allow the callback to return a bool which will raise an error and trigger
the cleanup code.

+     if (pgstat_is_kind_custom(key.kind) && kind_info->from_serialized_extra_stats)
+         kind_info->from_serialized_extra_stats(&key, header, fpin);
[...]
+     if (pgstat_is_kind_custom(ps->key.kind) && kind_info->to_serialized_extra_stats)
+         kind_info->to_serialized_extra_stats(&ps->key, shstats, fpout);

These restrictions based on custom kinds do not seem mandatory.
Why not allowing built-in kinds the same set of operations?

No good reason not to. In fact, maybe a follow-up will be to move the
replslot to this infrastructure and remove reliance on PGSTAT_FILE_ENTRY_NAME.

attached is the v4 patch set which includes:

0001 - which is just moving the tests out of injection points into a new
test module. This is similar to v3 [0]/messages/by-id/CAA5RZ0sG2RUKg=OLY+6-e4q=X9rsLfK3pKn03d=RZQppEDR=Bg@mail.gmail.com.

0002 - Is the code changes to implement the callbacks and the necessary
tests in the new test module.

[0]: /messages/by-id/CAA5RZ0sG2RUKg=OLY+6-e4q=X9rsLfK3pKn03d=RZQppEDR=Bg@mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)

#24

Michael Paquier

michael@paquier.xyz

3 days ago

In reply to: Sami Imseih (#22)

Re: [Proposal] Adding callback support for custom statistics kinds

On Fri, Dec 05, 2025 at 07:27:50PM -0600, Sami Imseih wrote:

Attached is the new test module that replaces the custom statistics
tests currently in the injection points tests. Under test_custom_stats, there
are two separate modules: one for variable-amount stats and one for
fixed-amount stats. With this, we can completely remove the
stats-related tests and supporting code under
src/test/modules/injection_points/.

Yes, thanks. Structurally, this is better and more flexible than what
we had originally, and I have noticed that you have copied the
original files while adding more comments and renaming a bit things:
the structure of the functions was exactly the same. Anyway, I have
worked on that for a good portion of the day, splitting the module
drop and the new module into two commits, and applied the result after
tweaking quite a few things in terms of names and comments (no
pgstat_*, a bit more "Var" and "Fixed", etc.), applying a much more
consistent set of names across the board for the functions and the
structures. This cleanup part is moved out of the way now, so that
you ease the introduction of the next pieces you are proposing.

The tests for the reset of fixed-sized stats was a nice addition,
indeed. If you have more areas that you think could be improved,
ideas are of course welcome.
--
Michael

#25

Sami Imseih

samimseih@gmail.com

2 days ago

In reply to: Michael Paquier (#24)

1 attachment(s)

Re: [Proposal] Adding callback support for custom statistics kinds

Yes, thanks. Structurally, this is better and more flexible than what
we had originally, and I have noticed that you have copied the
original files while adding more comments and renaming a bit things:
the structure of the functions was exactly the same. Anyway, I have
worked on that for a good portion of the day, splitting the module
drop and the new module into two commits, and applied the result after
tweaking quite a few things in terms of names and comments (no
pgstat_*, a bit more "Var" and "Fixed", etc.), applying a much more
consistent set of names across the board for the functions and the
structures. This cleanup part is moved out of the way now, so that
you ease the introduction of the next pieces you are proposing.

Thanks for getting these committed!

I rebased the custom callbacks patch in v5.

One very minor thing from the earlier commits that I corrected here is
the test for entry 2 after a clean restart.

-is($result, "entry1|2", "variable-sized stats persist after clean restart");
+is($result, "entry1|2|Test entry 1", "variable-sized stats persist
after clean restart");
+
+$result = $node->safe_psql('postgres', q(select * from
test_custom_stats_var_report('entry2')));
+is($result, "entry2|3|Test entry 2", "variable-sized stats persist
after clean restart");
+

--
Sami Imseih
Amazon Web Services (AWS)

#26

Chao Li

li.evan.chao@gmail.com

2 days ago

In reply to: Sami Imseih (#25)

Re: [Proposal] Adding callback support for custom statistics kinds

On Dec 9, 2025, at 03:09, Sami Imseih <samimseih@gmail.com> wrote:

Yes, thanks. Structurally, this is better and more flexible than what
we had originally, and I have noticed that you have copied the
original files while adding more comments and renaming a bit things:
the structure of the functions was exactly the same. Anyway, I have
worked on that for a good portion of the day, splitting the module
drop and the new module into two commits, and applied the result after
tweaking quite a few things in terms of names and comments (no
pgstat_*, a bit more "Var" and "Fixed", etc.), applying a much more
consistent set of names across the board for the functions and the
structures. This cleanup part is moved out of the way now, so that
you ease the introduction of the next pieces you are proposing.

Thanks for getting these committed!

I rebased the custom callbacks patch in v5.

One very minor thing from the earlier commits that I corrected here is
the test for entry 2 after a clean restart.
-is($result, "entry1|2", "variable-sized stats persist after clean restart");
+is($result, "entry1|2|Test entry 1", "variable-sized stats persist
after clean restart");
+
+$result = $node->safe_psql('postgres', q(select * from
test_custom_stats_var_report('entry2')));
+is($result, "entry2|3|Test entry 2", "variable-sized stats persist
after clean restart");
+
--
Sami Imseih
Amazon Web Services (AWS)
<v5-0001-Allow-cumulative-statistics-to-serialize-auxiliar.patch>

```
+					if (kind_info->from_serialized_extra_stats)
+					{
+						if (!kind_info->from_serialized_extra_stats(&key, header, fpin))
+						{
+							elog(WARNING, "could not read extra stats for entry %u/%u/%" PRIu64,
+								 key.kind, key.dboid, key.objid);
+							goto error;
+						}
+					}
```

When deserialize failed, it goes to error. In the error clause, it calls pgstat_reset_after_failure(), so do we want to give extensions a chance to do some reset operations? If yes, then we can add a reset_after_failure() callback.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#27

Sami Imseih

samimseih@gmail.com

2 days ago

In reply to: Chao Li (#26)

Re: [Proposal] Adding callback support for custom statistics kinds

+                                       if (kind_info->from_serialized_extra_stats)
+                                       {
+                                               if (!kind_info->from_serialized_extra_stats(&key, header, fpin))
+                                               {
+                                                       elog(WARNING, "could not read extra stats for entry %u/%u/%" PRIu64,
+                                                                key.kind, key.dboid, key.objid);
+                                                       goto error;
+                                               }
+                                       }
```

The way v5 is dealing with a deserialize failure is that when
it goes to error, the pgstat_reset_after_failure() will reset the
stats for all kinds, since pgstat_drop_all_entries() is called
during that call. So there is nothing for an extension to have
to do on its own. The extension will then clean-up resources
at the end when all the kinds are iterated over and
kind_info->end_extra_stats(STATS_READ) is called for each
kind.

Let me know if I'm still missing something?

--
Sami Imseih
Amazon Web Services (AWS)

#28

Michael Paquier

michael@paquier.xyz

2 days ago

In reply to: Sami Imseih (#27)

Re: [Proposal] Adding callback support for custom statistics kinds

On Mon, Dec 08, 2025 at 09:57:15PM -0600, Sami Imseih wrote:

The way v5 is dealing with a deserialize failure is that when
it goes to error, the pgstat_reset_after_failure() will reset the
stats for all kinds, since pgstat_drop_all_entries() is called
during that call. So there is nothing for an extension to have
to do on its own. The extension will then clean-up resources
at the end when all the kinds are iterated over and
kind_info->end_extra_stats(STATS_READ) is called for each
kind.

Let me know if I'm still missing something?

It seems to me that you are missing nothing here, and that Chao has
missed the fact that the end of pgstat_read_statsfile() does a "goto
done", meaning that we would take a round of
end_extra_stats(STATS_READ) to do all the cleanup after resetting all
the stats. That's what I would expect.

+static inline bool pgstat_check_extra_callbacks(PgStat_Kind kind);
[...]
@@ -645,6 +656,13 @@ pgstat_initialize(void)
+	/* Check a kind's extra-data callback setup */
+	for (PgStat_Kind kind = PGSTAT_KIND_BUILTIN_MIN; kind <= PGSTAT_KIND_BUILTIN_MAX; kind++)
+		if (!pgstat_check_extra_callbacks(kind))
+			ereport(ERROR,
+					errmsg("incomplete extra serialization callbacks for stats kind %d",
+						   kind));

Why does this part need to run each time a backend initializes its
access to pgstats? Shouldn't this happen only once when a stats kind
is registered? pgstat_register_kind() should be the only code path
that does such sanity checks.

By the way, checking that to_serialized_extra_stats and
kind_info->from_serialized_extra_stats need to be both defined is
fine as these are coupled together, but I am not following the reason
why end_extra_stats would need to be included in the set? For
example, a stats kind could decide to add some data to the main
pgstats file without creating extra files, hence they may not need to
define end_extra_stats.
--
Michael

#29

Chao Li

li.evan.chao@gmail.com

2 days ago

In reply to: Michael Paquier (#28)

Re: [Proposal] Adding callback support for custom statistics kinds

On Dec 9, 2025, at 12:45, Michael Paquier <michael@paquier.xyz> wrote:

It seems to me that you are missing nothing here, and that Chao has
missed the fact that the end of pgstat_read_statsfile() does a "goto
done", meaning that we would take a round of

No, I didn’t miss that part. But in the “done” clause:

```
done:
/* First, cleanup the main stats file, PGSTAT_STAT_PERMANENT_FILENAME */
FreeFile(fpin);

elog(DEBUG2, "removing permanent stats file \"%s\"", statfile);
unlink(statfile);

if (kind_info && kind_info->end_extra_stats)
kind_info->end_extra_stats(STATS_READ);
}
```

end_extra_stats(STATS_READ) has no failure indication.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#30

Chao Li

li.evan.chao@gmail.com

2 days ago

In reply to: Chao Li (#29)

Re: [Proposal] Adding callback support for custom statistics kinds

On Dec 9, 2025, at 13:23, Chao Li <li.evan.chao@gmail.com> wrote:

On Dec 9, 2025, at 12:45, Michael Paquier <michael@paquier.xyz> wrote:

It seems to me that you are missing nothing here, and that Chao has
missed the fact that the end of pgstat_read_statsfile() does a "goto
done", meaning that we would take a round of

No, I didn’t miss that part. But in the “done” clause:

```
done:
/* First, cleanup the main stats file, PGSTAT_STAT_PERMANENT_FILENAME */
FreeFile(fpin);

elog(DEBUG2, "removing permanent stats file \"%s\"", statfile);
unlink(statfile);

/* Let each stats kind run its cleanup callback, if it provides one */
for (PgStat_Kind kind = PGSTAT_KIND_MIN; kind <= PGSTAT_KIND_MAX; kind++)
{
const PgStat_KindInfo *kind_info = pgstat_get_kind_info(kind);

if (kind_info && kind_info->end_extra_stats)
kind_info->end_extra_stats(STATS_READ);
}
```

end_extra_stats(STATS_READ) has no failure indication.

Sorry, I incidentally clicked “send” too quickly.

My point is that, there are many places jumping to “error”, then from “error” goto “done”, if an error didn’t happen from the deserialize callback, how end_extra_stats() can know if failure happened and takes action accordingly?

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#31

Sami Imseih

samimseih@gmail.com

1 day ago

In reply to: Chao Li (#30)

1 attachment(s)

Re: [Proposal] Adding callback support for custom statistics kinds

My point is that, there are many places jumping to “error”, then from “error” goto “done”,
if an error didn’t happen from the deserialize callback, how end_extra_stats()
can know if failure happened and takes action accordingly?

IIUC, if *any* error occurs outside of a deserialize callback, first the "error"
code will be called, followed by "done" which will then trigger the
end_extra_stats
callback that will perform the cleanup.

Attached is v6 with a few minor indentation fixes and a correction to
freeing the file in the cleanup callback.

--
Sami Imseih
Amazon Web Services (AWS)

#32

Chao Li

li.evan.chao@gmail.com

1 day ago

In reply to: Sami Imseih (#31)

Re: [Proposal] Adding callback support for custom statistics kinds

On Dec 10, 2025, at 05:54, Sami Imseih <samimseih@gmail.com> wrote:

IIUC, if *any* error occurs outside of a deserialize callback, first the "error"
code will be called, followed by "done" which will then trigger the
end_extra_stats
callback that will perform the cleanup.

That is true. But problem is, without an error indication, end_extra_stats(STATS_READ) can only blindly perform cleanup works. As you are providing general purposed callbacks, who knows what scenarios extensions would do, so it’s better to provide more information to callbacks. IMO, letting end_extra_stats() know current situation (normal or failure, even error code) is very meaningful. For example, my extension may want to log “I am forced to quite due to outside error” or “I am done successfully” in end_extra_stats(). Anyway, that’s my own opinion. If you and Michael still consider that’s not a problem, I won’t argue more.

Best reagards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

#33

Sami Imseih

samimseih@gmail.com

1 day ago

In reply to: Chao Li (#32)

Re: [Proposal] Adding callback support for custom statistics kinds

IIUC, if *any* error occurs outside of a deserialize callback, first the

"error"

code will be called, followed by "done" which will then trigger the
end_extra_stats
callback that will perform the cleanup.

That is true. But problem is, without an error indication,
end_extra_stats(STATS_READ) can only blindly perform cleanup works. As you
are providing general purposed callbacks, who knows what scenarios
extensions would do, so it’s better to provide more information to
callbacks. IMO, letting end_extra_stats() know current situation (normal or
failure, even error code) is very meaningful. For example, my extension may
want to log “I am forced to quite due to outside error” or “I am done
successfully” in end_extra_stats(). Anyway, that’s my own opinion. If you
and Michael still consider that’s not a problem, I won’t argue more.

Thanks for explaining. If there is a good use-case to add more detail to
the “end” callback, it’s not very obvious yet. Maybe in the future, there
will be a convincing reason to do so.

When we hit the clean-up code on any “error”, it should be accompanied by
an error log. That is
done in all cases inside pgstat.c, and I expect an extension to log the
error as well.

--
Sami Imseih
Amazon Web Services (AWS)

#34

Michael Paquier

michael@paquier.xyz

about 20 hours ago

In reply to: Sami Imseih (#33)

Re: [Proposal] Adding callback support for custom statistics kinds

On Tue, Dec 09, 2025 at 05:15:45PM -0600, Sami Imseih wrote:

Thanks for explaining. If there is a good use-case to add more detail to
the “end” callback, it’s not very obvious yet. Maybe in the future, there
will be a convincing reason to do so.

The step taken by test_custom_var_stats_file_cleanup() for the
STATS_READ case shines for its simplicity. The STATS_DISCARD case is
also simple: we know that we want to ditch the stats.

Now, it is kind of true that the STATS_WRITE case feels a bit
disturbing written this way: we let a module take an action, but we
don't actually know the state of the main pgstats file when inside the
callback. I mean, you can know how things are going on, but it means
that a module can just rely on a check if
PGSTAT_STAT_PERMANENT_FILENAME is on disk, but an unlink() could have
failed as well. So, yes, I am wondering whether we should do what
Chao is suggesting, passing an extra state to the callback to let the
module know if we have actually succeeded or failed the operations
that have been taken on the main stats file before the callback
end_extra_stats is called in the three cases. It does not matter for
the STATS_READ case, but it may matter for the STATS_DISCARD or
STATS_WRITE case.

When we hit the clean-up code on any “error”, it should be accompanied by
an error log. That is
done in all cases inside pgstat.c, and I expect an extension to log the
error as well.

FWIW, I still have the same question as the one posted here about the
business in pgstat_initialize(), still present in v6:
/messages/by-id/aTepXZ97PsXpuywI@paquier.xyz

This remains unanswered.
--
Michael

#35

Sami Imseih

samimseih@gmail.com

about 9 hours ago

In reply to: Michael Paquier (#34)

Re: [Proposal] Adding callback support for custom statistics kinds

Now, it is kind of true that the STATS_WRITE case feels a bit
disturbing written this way: we let a module take an action, but we
don't actually know the state of the main pgstats file when inside the
callback. I mean, you can know how things are going on, but it means
that a module can just rely on a check if
PGSTAT_STAT_PERMANENT_FILENAME is on disk, but an unlink() could have
failed as well. So, yes, I am wondering whether we should do what
Chao is suggesting, passing an extra state to the callback to let the
module know if we have actually succeeded or failed the operations
that have been taken on the main stats file before the callback
end_extra_stats is called in the three cases. It does not matter for
the STATS_READ case, but it may matter for the STATS_DISCARD or
STATS_WRITE case.

I am having a hard time being convinced that this extra status is needed.
I am not expecting an extension to operate on the main stats file inside
the end_extra_stats callback, and even if some operation failed on the
main stats file, the cleanup callback will need to take the steps to
perform the cleanup on its own resources.

Is there a concrete example?

FWIW, I still have the same question as the one posted here about the
business in pgstat_initialize(), still present in v6:
/messages/by-id/aTepXZ97PsXpuywI@paquier.xyz

This remains unanswered.

Responding to the questions from the thread above.

Why does this part need to run each time a backend initializes its
access to pgstats?

Good point. This is unnecessary. This validation should really be
done inside StatsShmemInit by postmaster.

By the way, checking that to_serialized_extra_stats and
kind_info->from_serialized_extra_stats need to be both defined is
fine as these are coupled together, but I am not following the reason
why end_extra_stats would need to be included in the set? For
example, a stats kind could decide to add some data to the main
pgstats file without creating extra files, hence they may not need to
define end_extra_stats.

.. and after giving this more thought, I actually don't think we should
do any validation for any of the callbacks. If an extension is writing
to any file ( core or custom ), naturally they will want to read it back.
Now I am not sure what these validations are protecting us against.
Also, maybe the extension wants to just read data from the main stats
file, I could see that use-case, perhaps.

So, I am proposing removing the validation altogether. What do
you think?

--
Sami Imseih
Amazon Web Services (AWS)

Participants

Thread Outline

[Proposal] Adding callback support for custom statistics kinds

Attachments:

Attachments:

Attachments:

Attachments:

Attachments:

Attachments:

Attachments: