Add contrib/pg_logicalsnapinspect
Hi hackers,
Please find attached a patch to $SUBJECT.
This module provides SQL functions to inspect the contents of serialized logical
snapshots of a running database cluster, which I think could be useful for
debugging or educational purposes.
It's currently made of 2 functions, one to return the metadata:
postgres=# SELECT * FROM pg_get_logical_snapshot_meta('0/40796E18');
-[ RECORD 1 ]--------
magic | 1369563137
checksum | 1028045905
version | 6
and one to return more information:
postgres=# SELECT * FROM pg_get_logical_snapshot_info('0/40796E18');
-[ RECORD 1 ]------------+-----------
state | 2
xmin | 751
xmax | 751
start_decoding_at | 0/40796AF8
two_phase_at | 0/40796AF8
initial_xmin_horizon | 0
building_full_snapshot | f
in_slot_creation | f
last_serialized_snapshot | 0/0
next_phase_at | 0
committed_count | 0
committed_xip |
catchange_count | 2
catchange_xip | {751,752}
The LSN used as argument is extracted from the snapshot file name:
postgres=# select * from pg_ls_logicalsnapdir();
name | size | modification
-----------------+------+------------------------
0-40796E18.snap | 152 | 2024-08-14 16:36:32+00
(1 row)
A few remarks:
1. The "state" field is linked to the SnapBuildState enum (snapbuild.h). I've the
feeling that that's fine to display it as int but could write an helper function
to display strings instead ('SNAPBUILD_BUILDING_SNAPSHOT',...).
2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.
3. The pg_get_logical_snapshot_info() function mainly displays the SnapBuild
content extracted from the logical snapshot file.
4. I think that providing SQL functions is enough and that it's not needed to
also create a related binary tool.
5. A few PGDLLIMPORT have been added (Windows CI was failing).
6. Related documentation has been added.
7. A test has been added.
8. I don't like the module name that much but it follows the same as for
pg_walinspect.
Looking forward to your feedback,
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v1-0001-Add-contrib-pg_logicalsnapinspect.patchtext/x-diff; charset=us-asciiDownload+800-198
On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Please find attached a patch to $SUBJECT.
This module provides SQL functions to inspect the contents of serialized logical
snapshots of a running database cluster, which I think could be useful for
debugging or educational purposes.
+1. I see it could be a good debugging aid.
2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.
Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?
--
With Regards,
Amit Kapila.
Hi,
On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:
On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:Please find attached a patch to $SUBJECT.
This module provides SQL functions to inspect the contents of serialized logical
snapshots of a running database cluster, which I think could be useful for
debugging or educational purposes.+1. I see it could be a good debugging aid.
Thanks for the feedback!
2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?
Yeah, that's an option if we don't want to expose those structs to public.
I think we could 1/ create a function that would return a formed HeapTuple, or
2/ we could create multiple functions (about 15) that would return the values
we are interested in.
I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
tuple if one is interested to only one value).
What do you think? Did you have something else in mind?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Wed, Aug 28, 2024 at 1:25 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:
On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?Yeah, that's an option if we don't want to expose those structs to public.
I think we could 1/ create a function that would return a formed HeapTuple, or
2/ we could create multiple functions (about 15) that would return the values
we are interested in.I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
tuple if one is interested to only one value).
True, but OTOH, each time we add a new field to these structures, a
new function has to be exposed. I don't have a strong opinion on this
but seeing current use cases, it seems okay to expose a single
function.
What do you think? Did you have something else in mind?
On similar lines, we can also provide a function to get the slot's
on-disk data. IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.
--
With Regards,
Amit Kapila.
Hi,
On Thu, Aug 29, 2024 at 02:51:36PM +0530, Amit Kapila wrote:
On Wed, Aug 28, 2024 at 1:25 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:
On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?Yeah, that's an option if we don't want to expose those structs to public.
I think we could 1/ create a function that would return a formed HeapTuple, or
2/ we could create multiple functions (about 15) that would return the values
we are interested in.I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
tuple if one is interested to only one value).True, but OTOH, each time we add a new field to these structures, a
new function has to be exposed. I don't have a strong opinion on this
but seeing current use cases, it seems okay to expose a single
function.
Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?
What do you think? Did you have something else in mind?
On similar lines, we can also provide a function to get the slot's
on-disk data.
Yeah, having a way to expose the data from the disk makes fully sense to me.
IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.
That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?What do you think? Did you have something else in mind?
On similar lines, we can also provide a function to get the slot's
on-disk data.Yeah, having a way to expose the data from the disk makes fully sense to me.
IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).
+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.
FWIW, I wrote pg_replslotdata as a tool, not as an extension for
reading on-disk replication slot data to help when the server is down
- /messages/by-id/CALj2ACW0rV5gWK8A3m6_X62qH+Vfaq5hznC=i0R5Wojt5+yhyw@mail.gmail.com.
When the server is running, pg_get_replication_slots() pretty much
gives the on-disk contents.
--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Thu, Aug 29, 2024 at 06:33:19PM +0530, Bharath Rupireddy wrote:
On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.
Thanks for the feedback!
I don't see any use case where it could be useful when the server is down. So,
I think I'll move forward with in core functions (unless someone has a different
opinion).
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Hi,
On Thu, Aug 29, 2024 at 02:15:47PM +0000, Bertrand Drouvot wrote:
I don't see any use case where it could be useful when the server is down. So,
I think I'll move forward with in core functions (unless someone has a different
opinion).
Please find v2 attached that creates the 2 new in core functions.
Note that once those new functions are in (or maybe sooner), I'll submit an
additional patch to get rid of the code duplication between the new
ValidateSnapshotFile() and SnapBuildRestore().
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v2-0001-Functions-to-get-ondisk-logical-snapshots-details.patchtext/x-diff; charset=us-asciiDownload+385-2
On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?What do you think? Did you have something else in mind?
On similar lines, we can also provide a function to get the slot's
on-disk data.Yeah, having a way to expose the data from the disk makes fully sense to me.
IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.
We have an example of pageinspect which provides low-level functions
to aid debugging. The proposal for these APIs also seems to fall in
the same category, so why go for the core for these functions?
--
With Regards,
Amit Kapila.
Hi,
On Fri, Aug 30, 2024 at 03:43:12PM +0530, Amit Kapila wrote:
On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?What do you think? Did you have something else in mind?
On similar lines, we can also provide a function to get the slot's
on-disk data.Yeah, having a way to expose the data from the disk makes fully sense to me.
IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.We have an example of pageinspect which provides low-level functions
to aid debugging. The proposal for these APIs also seems to fall in
the same category,
That's right, but...
so why go for the core for these functions?
as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Fri, Aug 30, 2024 at 03:43:12PM +0530, Amit Kapila wrote:
On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?What do you think? Did you have something else in mind?
On similar lines, we can also provide a function to get the slot's
on-disk data.Yeah, having a way to expose the data from the disk makes fully sense to me.
IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.We have an example of pageinspect which provides low-level functions
to aid debugging. The proposal for these APIs also seems to fall in
the same category,That's right, but...
so why go for the core for these functions?
as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.
I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?
--
With Regards,
Amit Kapila.
Hi,
On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote:
On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?
I looked at when the SNAPBUILD_VERSION has been changed:
ec5896aed3 (2014)
a975ff4980 (2021)
8bdb1332eb (2021)
7f13ac8123 (2022)
bb19b70081 (2024)
So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that
frequently. Furthermore, those structs are serialized and so we have to preserve
their on-disk compatibility (means we can change them only in a major release
if we need to).
So, I think it would not be that much of an issue to expose those structs and
create a new contrib module (as v1 did propose) instead of in core new functions.
If we want to insist that external modules "should" not rely on those structs then
we could put them into a new internal_snapbuild.h file (instead of snapbuild.h
as proposed in v1).
At the end, I think that creating a contrib module and exposing those structs in
internal_snapbuild.h make more sense (as compared to in core functions).
Thoughts?
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
On Tue, Sep 10, 2024 at 8:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote:
On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?I looked at when the SNAPBUILD_VERSION has been changed:
ec5896aed3 (2014)
a975ff4980 (2021)
8bdb1332eb (2021)
7f13ac8123 (2022)
bb19b70081 (2024)So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that
frequently. Furthermore, those structs are serialized and so we have to preserve
their on-disk compatibility (means we can change them only in a major release
if we need to).So, I think it would not be that much of an issue to expose those structs and
create a new contrib module (as v1 did propose) instead of in core new functions.If we want to insist that external modules "should" not rely on those structs then
we could put them into a new internal_snapbuild.h file (instead of snapbuild.h
as proposed in v1).
Adding snapbuild_internal.h sounds like a good idea.
At the end, I think that creating a contrib module and exposing those structs in
internal_snapbuild.h make more sense (as compared to in core functions).
Fail enough. We can keep the module name as logicalinspect so that we
can extend it for other logical decoding/replication-related files in
the future.
--
With Regards,
Amit Kapila.
Hi,
On Wed, Sep 11, 2024 at 10:30:37AM +0530, Amit Kapila wrote:
On Tue, Sep 10, 2024 at 8:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote:
On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?I looked at when the SNAPBUILD_VERSION has been changed:
ec5896aed3 (2014)
a975ff4980 (2021)
8bdb1332eb (2021)
7f13ac8123 (2022)
bb19b70081 (2024)So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that
frequently. Furthermore, those structs are serialized and so we have to preserve
their on-disk compatibility (means we can change them only in a major release
if we need to).So, I think it would not be that much of an issue to expose those structs and
create a new contrib module (as v1 did propose) instead of in core new functions.If we want to insist that external modules "should" not rely on those structs then
we could put them into a new internal_snapbuild.h file (instead of snapbuild.h
as proposed in v1).Adding snapbuild_internal.h sounds like a good idea.
Thanks for the feedback!
At the end, I think that creating a contrib module and exposing those structs in
internal_snapbuild.h make more sense (as compared to in core functions).Fail enough. We can keep the module name as logicalinspect so that we
can extend it for other logical decoding/replication-related files in
the future.
Yeah, good idea. Done that way in v3 attached.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v3-0001-Add-contrib-pg_logicalinspect.patchtext/x-diff; charset=us-asciiDownload+826-194
On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Yeah, good idea. Done that way in v3 attached.
Thanks for the patch. +1 on the patch's idea. I have started
reviewing/testing it. It is WIP but please find few initial comments:
src/backend/replication/logical/snapbuild.c:
1)
+ fsync_fname("pg_logical/snapshots", true);
Should we use macros PG_LOGICAL_DIR and PG_LOGICAL_SNAPSHOTS_DIR in
ValidateSnapshotFile(), instead of hard coding the path
2)
Same as above in pg_get_logical_snapshot_meta() and
pg_get_logical_snapshot_info()
+ sprintf(path, "pg_logical/snapshots/%X-%X.snap",
+ LSN_FORMAT_ARGS(lsn)); LSN_FORMAT_ARGS(lsn));
3)
+#include "replication/internal_snapbuild.h"
Shall we name new file as 'snapbuild_internal.h' instead of
'internal_snapbuild.h'. Please see other files' name under
'./src/include/replication':
worker_internal.h
walsender_private.h
4)
+static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk,
+ const char *path);
Is it required? We generally don't add declaration unless required by
compiler. Since definition is prior to usage, it is not needed?
thanks
Shveta
Hi,
On Mon, Sep 16, 2024 at 04:02:51PM +0530, shveta malik wrote:
On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:Yeah, good idea. Done that way in v3 attached.
Thanks for the patch. +1 on the patch's idea. I have started
reviewing/testing it. It is WIP but please find few initial comments:
Thanks for sharing your thoughts and for the review!
src/backend/replication/logical/snapbuild.c:
1)
+ fsync_fname("pg_logical/snapshots", true);Should we use macros PG_LOGICAL_DIR and PG_LOGICAL_SNAPSHOTS_DIR in
ValidateSnapshotFile(), instead of hard coding the path2)
Same as above in pg_get_logical_snapshot_meta() and
pg_get_logical_snapshot_info()+ sprintf(path, "pg_logical/snapshots/%X-%X.snap", + LSN_FORMAT_ARGS(lsn)); LSN_FORMAT_ARGS(lsn));
Doh! Yeah, agree that we should use those macros. They are coming from c39afc38cf
which has been introduced after v1 of this patch. I thought I took care of it once
c39afc38cf went in, but it looks like I missed it somehow. Done in v4 attached,
Thanks!
3)
+#include "replication/internal_snapbuild.h"Shall we name new file as 'snapbuild_internal.h' instead of
'internal_snapbuild.h'. Please see other files' name under
'./src/include/replication':
worker_internal.h
walsender_private.h
Agree, that should be snapbuild_internal.h, done in v4.
4) +static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk, + const char *path);Is it required? We generally don't add declaration unless required by
compiler. Since definition is prior to usage, it is not needed?
I personally prefer to add them even if not required by the compiler. I did not
pay attention that "We generally don't add declaration unless required by compiler"
and (after a quick check) I did not find any reference in the coding style
documentation [1]https://www.postgresql.org/docs/current/source.html. That said, I don't have a strong opinion about that and so
removed in v4. Worth to add a mention in the coding convention doc?
[1]: https://www.postgresql.org/docs/current/source.html
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachments:
v4-0001-Add-contrib-pg_logicalinspect.patchtext/x-diff; charset=us-asciiDownload+825-194
On Mon, Sep 16, 2024 at 8:03 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:
Hi,
On Mon, Sep 16, 2024 at 04:02:51PM +0530, shveta malik wrote:
On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:Yeah, good idea. Done that way in v3 attached.
Thanks for the patch. +1 on the patch's idea. I have started
reviewing/testing it. It is WIP but please find few initial comments:Thanks for sharing your thoughts and for the review!
src/backend/replication/logical/snapbuild.c:
1)
+ fsync_fname("pg_logical/snapshots", true);Should we use macros PG_LOGICAL_DIR and PG_LOGICAL_SNAPSHOTS_DIR in
ValidateSnapshotFile(), instead of hard coding the path2)
Same as above in pg_get_logical_snapshot_meta() and
pg_get_logical_snapshot_info()+ sprintf(path, "pg_logical/snapshots/%X-%X.snap", + LSN_FORMAT_ARGS(lsn)); LSN_FORMAT_ARGS(lsn));Doh! Yeah, agree that we should use those macros. They are coming from c39afc38cf
which has been introduced after v1 of this patch. I thought I took care of it once
c39afc38cf went in, but it looks like I missed it somehow. Done in v4 attached,
Thanks!3)
+#include "replication/internal_snapbuild.h"Shall we name new file as 'snapbuild_internal.h' instead of
'internal_snapbuild.h'. Please see other files' name under
'./src/include/replication':
worker_internal.h
walsender_private.hAgree, that should be snapbuild_internal.h, done in v4.
4) +static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk, + const char *path);Is it required? We generally don't add declaration unless required by
compiler. Since definition is prior to usage, it is not needed?I personally prefer to add them even if not required by the compiler. I did not
pay attention that "We generally don't add declaration unless required by compiler"
and (after a quick check) I did not find any reference in the coding style
documentation [1]. That said, I don't have a strong opinion about that and so
removed in v4. Worth to add a mention in the coding convention doc?
Okay. I was somehow under the impression that this is the way in the
postgres i.e. not add redundant declarations. Will be good to know
what others think on this.
Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:
1)
+#include "port/pg_crc32c.h"
It is not needed in pg_logicalinspect.c as it is already included in
internal_snapbuild.h
2)
+ values[0] = Int16GetDatum(ondisk.builder.state);
........
+ values[8] = LSNGetDatum(ondisk.builder.last_serialized_snapshot);
+ values[9] = TransactionIdGetDatum(ondisk.builder.next_phase_at);
+ values[10] = Int64GetDatum(ondisk.builder.committed.xcnt);
We can have values[i++] in all the places and later we can check :
Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS);
Then we need not to keep track of number even in later part of code,
as it goes till 14.
3)
Similar change can be done here:
+ values[0] = Int32GetDatum(ondisk.magic);
+ values[1] = Int32GetDatum(ondisk.checksum);
+ values[2] = Int32GetDatum(ondisk.version);
check at the end will be: Assert(i == PG_GET_LOGICAL_SNAPSHOT_META_COLS);
4)
Most of the output columns in pg_get_logical_snapshot_info() look
self-explanatory except 'state'. Should we have meaningful 'text' here
corresponding to SnapBuildState? Similar to what we do for
'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses
for ReplicationSlotInvalidationCause)
thanks
Shveta
On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote:
Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:
I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:
newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
magic | checksum | version
-------+----------+---------
| |
(1 row)
Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:
newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR: function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.
Shouldn't the new functions have same behavior?
thanks
Shveta
On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com>
wrote:Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
magic | checksum | version
-------+----------+---------
| |(1 row)
Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR: function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.Shouldn't the new functions have same behavior?
No. Since the name pg_ls_logicalsnapdir has zero single-argument
implementations passing a null value as an argument is indeed attempt to
invoke a function signature that doesn’t exist.
If there is exactly one single input argument function of the given name
the parser is going to cast the null literal to the data type of the single
argument and invoke the function. It will not and cannot be convinced to
fail to find a matching function.
I can see an argument that they should produce an empty set instead of a
single all-null row, but the idea that they wouldn’t even be found is
contrary to a core design of the system.
David J.
Hi,
On Mon, Sep 16, 2024 at 10:16:16PM -0700, David G. Johnston wrote:
On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote:
On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com>
wrote:Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
magic | checksum | version
-------+----------+---------
| |(1 row)
Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR: function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.Shouldn't the new functions have same behavior?
No. Since the name pg_ls_logicalsnapdir has zero single-argument
implementations passing a null value as an argument is indeed attempt to
invoke a function signature that doesn’t exist.
Agree.
I can see an argument that they should produce an empty set instead of a
single all-null row,
Yeah, it's outside the scope of this patch but I've seen different behavior
in this area.
For example:
postgres=# select * from pg_ls_replslotdir(NULL);
name | size | modification
------+------+--------------
(0 rows)
as compared to:
postgres=# select * from pg_walfile_name_offset(NULL);
file_name | file_offset
-----------+-------------
|
(1 row)
I thought that it might be linked to the volatility but it is not:
postgres=# select * from pg_stat_get_xact_blocks_fetched(NULL);
pg_stat_get_xact_blocks_fetched
---------------------------------
(1 row)
postgres=# select * from pg_get_multixact_members(NULL);
xid | mode
-----+------
(0 rows)
while both are volatile.
I think both make sense: It's "empty" or we "don't know the values of the fields".
I don't know if there is any reason about this "inconsistency".
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com