Add contrib/pg_logicalsnapinspect

Started by Bertrand Drouvotover 1 year ago81 messages
Jump to latest
#1Bertrand Drouvot
bertranddrouvot.pg@gmail.com

Hi hackers,

Please find attached a patch to $SUBJECT.

This module provides SQL functions to inspect the contents of serialized logical
snapshots of a running database cluster, which I think could be useful for
debugging or educational purposes.

It's currently made of 2 functions, one to return the metadata:

postgres=# SELECT * FROM pg_get_logical_snapshot_meta('0/40796E18');
-[ RECORD 1 ]--------
magic | 1369563137
checksum | 1028045905
version | 6

and one to return more information:

postgres=# SELECT * FROM pg_get_logical_snapshot_info('0/40796E18');
-[ RECORD 1 ]------------+-----------
state | 2
xmin | 751
xmax | 751
start_decoding_at | 0/40796AF8
two_phase_at | 0/40796AF8
initial_xmin_horizon | 0
building_full_snapshot | f
in_slot_creation | f
last_serialized_snapshot | 0/0
next_phase_at | 0
committed_count | 0
committed_xip |
catchange_count | 2
catchange_xip | {751,752}

The LSN used as argument is extracted from the snapshot file name:

postgres=# select * from pg_ls_logicalsnapdir();
name | size | modification
-----------------+------+------------------------
0-40796E18.snap | 152 | 2024-08-14 16:36:32+00
(1 row)

A few remarks:

1. The "state" field is linked to the SnapBuildState enum (snapbuild.h). I've the
feeling that that's fine to display it as int but could write an helper function
to display strings instead ('SNAPBUILD_BUILDING_SNAPSHOT',...).

2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.

3. The pg_get_logical_snapshot_info() function mainly displays the SnapBuild
content extracted from the logical snapshot file.

4. I think that providing SQL functions is enough and that it's not needed to
also create a related binary tool.

5. A few PGDLLIMPORT have been added (Windows CI was failing).

6. Related documentation has been added.

7. A test has been added.

8. I don't like the module name that much but it follows the same as for
pg_walinspect.

Looking forward to your feedback,

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v1-0001-Add-contrib-pg_logicalsnapinspect.patchtext/x-diff; charset=us-asciiDownload+800-198
#2Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#1)
Re: Add contrib/pg_logicalsnapinspect

On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Please find attached a patch to $SUBJECT.

This module provides SQL functions to inspect the contents of serialized logical
snapshots of a running database cluster, which I think could be useful for
debugging or educational purposes.

+1. I see it could be a good debugging aid.

2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.

Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?

--
With Regards,
Amit Kapila.

#3Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#2)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:

On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Please find attached a patch to $SUBJECT.

This module provides SQL functions to inspect the contents of serialized logical
snapshots of a running database cluster, which I think could be useful for
debugging or educational purposes.

+1. I see it could be a good debugging aid.

Thanks for the feedback!

2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.

Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?

Yeah, that's an option if we don't want to expose those structs to public.

I think we could 1/ create a function that would return a formed HeapTuple, or
2/ we could create multiple functions (about 15) that would return the values
we are interested in.

I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
tuple if one is interested to only one value).

What do you think? Did you have something else in mind?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#4Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#3)
Re: Add contrib/pg_logicalsnapinspect

On Wed, Aug 28, 2024 at 1:25 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:

On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.

Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?

Yeah, that's an option if we don't want to expose those structs to public.

I think we could 1/ create a function that would return a formed HeapTuple, or
2/ we could create multiple functions (about 15) that would return the values
we are interested in.

I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
tuple if one is interested to only one value).

True, but OTOH, each time we add a new field to these structures, a
new function has to be exposed. I don't have a strong opinion on this
but seeing current use cases, it seems okay to expose a single
function.

What do you think? Did you have something else in mind?

On similar lines, we can also provide a function to get the slot's
on-disk data. IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.

--
With Regards,
Amit Kapila.

#5Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#4)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Thu, Aug 29, 2024 at 02:51:36PM +0530, Amit Kapila wrote:

On Wed, Aug 28, 2024 at 1:25 AM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Mon, Aug 26, 2024 at 07:05:27PM +0530, Amit Kapila wrote:

On Thu, Aug 22, 2024 at 5:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

2. The SnapBuildOnDisk and SnapBuild structs are now exposed to public. Means
we should now pay much more attention when changing their contents but I think
it's worth it.

Is it possible to avoid exposing these structures? Can we expose some
function from snapbuild.c that provides the required information?

Yeah, that's an option if we don't want to expose those structs to public.

I think we could 1/ create a function that would return a formed HeapTuple, or
2/ we could create multiple functions (about 15) that would return the values
we are interested in.

I think 2/ is fine as it would give more flexiblity (no need to retrieve a whole
tuple if one is interested to only one value).

True, but OTOH, each time we add a new field to these structures, a
new function has to be exposed. I don't have a strong opinion on this
but seeing current use cases, it seems okay to expose a single
function.

Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?

What do you think? Did you have something else in mind?

On similar lines, we can also provide a function to get the slot's
on-disk data.

Yeah, having a way to expose the data from the disk makes fully sense to me.

IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.

That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#6Bharath Rupireddy
bharath.rupireddyforpostgres@gmail.com
In reply to: Bertrand Drouvot (#5)
Re: Add contrib/pg_logicalsnapinspect

On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?

What do you think? Did you have something else in mind?

On similar lines, we can also provide a function to get the slot's
on-disk data.

Yeah, having a way to expose the data from the disk makes fully sense to me.

IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.

That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).

+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.

FWIW, I wrote pg_replslotdata as a tool, not as an extension for
reading on-disk replication slot data to help when the server is down
- /messages/by-id/CALj2ACW0rV5gWK8A3m6_X62qH+Vfaq5hznC=i0R5Wojt5+yhyw@mail.gmail.com.
When the server is running, pg_get_replication_slots() pretty much
gives the on-disk contents.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#7Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Bharath Rupireddy (#6)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Thu, Aug 29, 2024 at 06:33:19PM +0530, Bharath Rupireddy wrote:

On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).

+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.

Thanks for the feedback!

I don't see any use case where it could be useful when the server is down. So,
I think I'll move forward with in core functions (unless someone has a different
opinion).

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#8Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Bertrand Drouvot (#7)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Thu, Aug 29, 2024 at 02:15:47PM +0000, Bertrand Drouvot wrote:

I don't see any use case where it could be useful when the server is down. So,
I think I'll move forward with in core functions (unless someone has a different
opinion).

Please find v2 attached that creates the 2 new in core functions.

Note that once those new functions are in (or maybe sooner), I'll submit an
additional patch to get rid of the code duplication between the new
ValidateSnapshotFile() and SnapBuildRestore().

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v2-0001-Functions-to-get-ondisk-logical-snapshots-details.patchtext/x-diff; charset=us-asciiDownload+385-2
#9Amit Kapila
amit.kapila16@gmail.com
In reply to: Bharath Rupireddy (#6)
Re: Add contrib/pg_logicalsnapinspect

On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?

What do you think? Did you have something else in mind?

On similar lines, we can also provide a function to get the slot's
on-disk data.

Yeah, having a way to expose the data from the disk makes fully sense to me.

IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.

That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).

+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.

We have an example of pageinspect which provides low-level functions
to aid debugging. The proposal for these APIs also seems to fall in
the same category, so why go for the core for these functions?

--
With Regards,
Amit Kapila.

#10Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#9)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Fri, Aug 30, 2024 at 03:43:12PM +0530, Amit Kapila wrote:

On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?

What do you think? Did you have something else in mind?

On similar lines, we can also provide a function to get the slot's
on-disk data.

Yeah, having a way to expose the data from the disk makes fully sense to me.

IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.

That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).

+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.

We have an example of pageinspect which provides low-level functions
to aid debugging. The proposal for these APIs also seems to fall in
the same category,

That's right, but...

so why go for the core for these functions?

as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#11Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#10)
Re: Add contrib/pg_logicalsnapinspect

On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Fri, Aug 30, 2024 at 03:43:12PM +0530, Amit Kapila wrote:

On Thu, Aug 29, 2024 at 6:33 PM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:

On Thu, Aug 29, 2024 at 3:44 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Yeah that's fair. And now I'm wondering if we need an extra module. I think
we could "simply" expose 2 new functions in core, thoughts?

What do you think? Did you have something else in mind?

On similar lines, we can also provide a function to get the slot's
on-disk data.

Yeah, having a way to expose the data from the disk makes fully sense to me.

IIRC, Bharath had previously proposed a tool to achieve
the same. It is fine if we don't want to add that as part of this
patch but I mentioned it because by having that we can have a set of
functions to view logical decoding data.

That's right. I think this one would be simply enough to expose one or two
functions in core too (and probably would not need an extra module).

+1 for functions in core unless this extra module
pg_logicalsnapinspect works as a tool to be helpful even when the
server is down.

We have an example of pageinspect which provides low-level functions
to aid debugging. The proposal for these APIs also seems to fall in
the same category,

That's right, but...

so why go for the core for these functions?

as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.

I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?

--
With Regards,
Amit Kapila.

#12Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#11)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote:

On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.

I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?

I looked at when the SNAPBUILD_VERSION has been changed:

ec5896aed3 (2014)
a975ff4980 (2021)
8bdb1332eb (2021)
7f13ac8123 (2022)
bb19b70081 (2024)

So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that
frequently. Furthermore, those structs are serialized and so we have to preserve
their on-disk compatibility (means we can change them only in a major release
if we need to).

So, I think it would not be that much of an issue to expose those structs and
create a new contrib module (as v1 did propose) instead of in core new functions.

If we want to insist that external modules "should" not rely on those structs then
we could put them into a new internal_snapbuild.h file (instead of snapbuild.h
as proposed in v1).

At the end, I think that creating a contrib module and exposing those structs in
internal_snapbuild.h make more sense (as compared to in core functions).

Thoughts?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#13Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#12)
Re: Add contrib/pg_logicalsnapinspect

On Tue, Sep 10, 2024 at 8:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote:

On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.

I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?

I looked at when the SNAPBUILD_VERSION has been changed:

ec5896aed3 (2014)
a975ff4980 (2021)
8bdb1332eb (2021)
7f13ac8123 (2022)
bb19b70081 (2024)

So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that
frequently. Furthermore, those structs are serialized and so we have to preserve
their on-disk compatibility (means we can change them only in a major release
if we need to).

So, I think it would not be that much of an issue to expose those structs and
create a new contrib module (as v1 did propose) instead of in core new functions.

If we want to insist that external modules "should" not rely on those structs then
we could put them into a new internal_snapbuild.h file (instead of snapbuild.h
as proposed in v1).

Adding snapbuild_internal.h sounds like a good idea.

At the end, I think that creating a contrib module and exposing those structs in
internal_snapbuild.h make more sense (as compared to in core functions).

Fail enough. We can keep the module name as logicalinspect so that we
can extend it for other logical decoding/replication-related files in
the future.

--
With Regards,
Amit Kapila.

#14Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#13)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Wed, Sep 11, 2024 at 10:30:37AM +0530, Amit Kapila wrote:

On Tue, Sep 10, 2024 at 8:56 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

On Mon, Sep 09, 2024 at 04:24:09PM +0530, Amit Kapila wrote:

On Fri, Aug 30, 2024 at 5:18 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

as we decided not to expose the SnapBuildOnDisk and SnapBuild structs to public
and to create/expose 2 new functions in snapbuild.c then the functions in the
module would do nothing but expose the data coming from the snapbuild.c's
functions (get the tuple and send it to the client). That sounds weird to me to
create a module that would "only" do so, that's why I thought that in core
functions taking care of everything make more sense.

I see your point. Does anyone else have an opinion on the need for
these functions and whether to expose them from a contrib module or
have them as core functions?

I looked at when the SNAPBUILD_VERSION has been changed:

ec5896aed3 (2014)
a975ff4980 (2021)
8bdb1332eb (2021)
7f13ac8123 (2022)
bb19b70081 (2024)

So it's not like we are changing the SnapBuildOnDisk or SnapBuild structs that
frequently. Furthermore, those structs are serialized and so we have to preserve
their on-disk compatibility (means we can change them only in a major release
if we need to).

So, I think it would not be that much of an issue to expose those structs and
create a new contrib module (as v1 did propose) instead of in core new functions.

If we want to insist that external modules "should" not rely on those structs then
we could put them into a new internal_snapbuild.h file (instead of snapbuild.h
as proposed in v1).

Adding snapbuild_internal.h sounds like a good idea.

Thanks for the feedback!

At the end, I think that creating a contrib module and exposing those structs in
internal_snapbuild.h make more sense (as compared to in core functions).

Fail enough. We can keep the module name as logicalinspect so that we
can extend it for other logical decoding/replication-related files in
the future.

Yeah, good idea. Done that way in v3 attached.

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v3-0001-Add-contrib-pg_logicalinspect.patchtext/x-diff; charset=us-asciiDownload+826-194
#15shveta malik
shveta.malik@gmail.com
In reply to: Bertrand Drouvot (#14)
Re: Add contrib/pg_logicalsnapinspect

On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Yeah, good idea. Done that way in v3 attached.

Thanks for the patch. +1 on the patch's idea. I have started
reviewing/testing it. It is WIP but please find few initial comments:

src/backend/replication/logical/snapbuild.c:

1)
+ fsync_fname("pg_logical/snapshots", true);

Should we use macros PG_LOGICAL_DIR and PG_LOGICAL_SNAPSHOTS_DIR in
ValidateSnapshotFile(), instead of hard coding the path

2)
Same as above in pg_get_logical_snapshot_meta() and
pg_get_logical_snapshot_info()

+ sprintf(path, "pg_logical/snapshots/%X-%X.snap",
+ LSN_FORMAT_ARGS(lsn));                        LSN_FORMAT_ARGS(lsn));

3)
+#include "replication/internal_snapbuild.h"

Shall we name new file as 'snapbuild_internal.h' instead of
'internal_snapbuild.h'. Please see other files' name under
'./src/include/replication':
worker_internal.h
walsender_private.h

4)
+static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk,
+ const char *path);

Is it required? We generally don't add declaration unless required by
compiler. Since definition is prior to usage, it is not needed?

thanks
Shveta

#16Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: shveta malik (#15)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Mon, Sep 16, 2024 at 04:02:51PM +0530, shveta malik wrote:

On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Yeah, good idea. Done that way in v3 attached.

Thanks for the patch. +1 on the patch's idea. I have started
reviewing/testing it. It is WIP but please find few initial comments:

Thanks for sharing your thoughts and for the review!

src/backend/replication/logical/snapbuild.c:

1)
+ fsync_fname("pg_logical/snapshots", true);

Should we use macros PG_LOGICAL_DIR and PG_LOGICAL_SNAPSHOTS_DIR in
ValidateSnapshotFile(), instead of hard coding the path

2)
Same as above in pg_get_logical_snapshot_meta() and
pg_get_logical_snapshot_info()

+ sprintf(path, "pg_logical/snapshots/%X-%X.snap",
+ LSN_FORMAT_ARGS(lsn));                        LSN_FORMAT_ARGS(lsn));

Doh! Yeah, agree that we should use those macros. They are coming from c39afc38cf
which has been introduced after v1 of this patch. I thought I took care of it once
c39afc38cf went in, but it looks like I missed it somehow. Done in v4 attached,
Thanks!

3)
+#include "replication/internal_snapbuild.h"

Shall we name new file as 'snapbuild_internal.h' instead of
'internal_snapbuild.h'. Please see other files' name under
'./src/include/replication':
worker_internal.h
walsender_private.h

Agree, that should be snapbuild_internal.h, done in v4.

4)
+static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk,
+ const char *path);

Is it required? We generally don't add declaration unless required by
compiler. Since definition is prior to usage, it is not needed?

I personally prefer to add them even if not required by the compiler. I did not
pay attention that "We generally don't add declaration unless required by compiler"
and (after a quick check) I did not find any reference in the coding style
documentation [1]https://www.postgresql.org/docs/current/source.html. That said, I don't have a strong opinion about that and so
removed in v4. Worth to add a mention in the coding convention doc?

[1]: https://www.postgresql.org/docs/current/source.html

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachments:

v4-0001-Add-contrib-pg_logicalinspect.patchtext/x-diff; charset=us-asciiDownload+825-194
#17shveta malik
shveta.malik@gmail.com
In reply to: Bertrand Drouvot (#16)
Re: Add contrib/pg_logicalsnapinspect

On Mon, Sep 16, 2024 at 8:03 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Hi,

On Mon, Sep 16, 2024 at 04:02:51PM +0530, shveta malik wrote:

On Wed, Sep 11, 2024 at 4:21 PM Bertrand Drouvot
<bertranddrouvot.pg@gmail.com> wrote:

Yeah, good idea. Done that way in v3 attached.

Thanks for the patch. +1 on the patch's idea. I have started
reviewing/testing it. It is WIP but please find few initial comments:

Thanks for sharing your thoughts and for the review!

src/backend/replication/logical/snapbuild.c:

1)
+ fsync_fname("pg_logical/snapshots", true);

Should we use macros PG_LOGICAL_DIR and PG_LOGICAL_SNAPSHOTS_DIR in
ValidateSnapshotFile(), instead of hard coding the path

2)
Same as above in pg_get_logical_snapshot_meta() and
pg_get_logical_snapshot_info()

+ sprintf(path, "pg_logical/snapshots/%X-%X.snap",
+ LSN_FORMAT_ARGS(lsn));                        LSN_FORMAT_ARGS(lsn));

Doh! Yeah, agree that we should use those macros. They are coming from c39afc38cf
which has been introduced after v1 of this patch. I thought I took care of it once
c39afc38cf went in, but it looks like I missed it somehow. Done in v4 attached,
Thanks!

3)
+#include "replication/internal_snapbuild.h"

Shall we name new file as 'snapbuild_internal.h' instead of
'internal_snapbuild.h'. Please see other files' name under
'./src/include/replication':
worker_internal.h
walsender_private.h

Agree, that should be snapbuild_internal.h, done in v4.

4)
+static void ValidateSnapshotFile(XLogRecPtr lsn, SnapBuildOnDisk *ondisk,
+ const char *path);

Is it required? We generally don't add declaration unless required by
compiler. Since definition is prior to usage, it is not needed?

I personally prefer to add them even if not required by the compiler. I did not
pay attention that "We generally don't add declaration unless required by compiler"
and (after a quick check) I did not find any reference in the coding style
documentation [1]. That said, I don't have a strong opinion about that and so
removed in v4. Worth to add a mention in the coding convention doc?

Okay. I was somehow under the impression that this is the way in the
postgres i.e. not add redundant declarations. Will be good to know
what others think on this.

Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:

1)
+#include "port/pg_crc32c.h"

It is not needed in pg_logicalinspect.c as it is already included in
internal_snapbuild.h

2)
+ values[0] = Int16GetDatum(ondisk.builder.state);
........
+ values[8] = LSNGetDatum(ondisk.builder.last_serialized_snapshot);
+ values[9] = TransactionIdGetDatum(ondisk.builder.next_phase_at);
+ values[10] = Int64GetDatum(ondisk.builder.committed.xcnt);

We can have values[i++] in all the places and later we can check :
Assert(i == PG_GET_LOGICAL_SNAPSHOT_INFO_COLS);
Then we need not to keep track of number even in later part of code,
as it goes till 14.

3)
Similar change can be done here:

+ values[0] = Int32GetDatum(ondisk.magic);
+ values[1] = Int32GetDatum(ondisk.checksum);
+ values[2] = Int32GetDatum(ondisk.version);

check at the end will be: Assert(i == PG_GET_LOGICAL_SNAPSHOT_META_COLS);

4)
Most of the output columns in pg_get_logical_snapshot_info() look
self-explanatory except 'state'. Should we have meaningful 'text' here
corresponding to SnapBuildState? Similar to what we do for
'invalidation_reason' in pg_replication_slots. (SlotInvalidationCauses
for ReplicationSlotInvalidationCause)

thanks
Shveta

#18shveta malik
shveta.malik@gmail.com
In reply to: shveta malik (#17)
Re: Add contrib/pg_logicalsnapinspect

On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com> wrote:

Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:

I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:

newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
magic | checksum | version
-------+----------+---------
| |

(1 row)

Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:

newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR: function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.

Shouldn't the new functions have same behavior?

thanks
Shveta

#19David G. Johnston
david.g.johnston@gmail.com
In reply to: shveta malik (#18)
Re: Add contrib/pg_logicalsnapinspect

On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com>
wrote:

Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:

I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:

newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
magic | checksum | version
-------+----------+---------
| |

(1 row)

Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:

newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR: function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.

Shouldn't the new functions have same behavior?

No. Since the name pg_ls_logicalsnapdir has zero single-argument
implementations passing a null value as an argument is indeed attempt to
invoke a function signature that doesn’t exist.

If there is exactly one single input argument function of the given name
the parser is going to cast the null literal to the data type of the single
argument and invoke the function. It will not and cannot be convinced to
fail to find a matching function.

I can see an argument that they should produce an empty set instead of a
single all-null row, but the idea that they wouldn’t even be found is
contrary to a core design of the system.

David J.

#20Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: David G. Johnston (#19)
Re: Add contrib/pg_logicalsnapinspect

Hi,

On Mon, Sep 16, 2024 at 10:16:16PM -0700, David G. Johnston wrote:

On Monday, September 16, 2024, shveta malik <shveta.malik@gmail.com> wrote:

On Tue, Sep 17, 2024 at 10:18 AM shveta malik <shveta.malik@gmail.com>
wrote:

Thanks for addressing the comments. I have not started reviewing v4
yet, but here are few more comments on v3:

I just noticed that when we pass NULL input, both the new functions
give 1 row as output, all cols as NULL:

newdb1=# SELECT * FROM pg_get_logical_snapshot_meta(NULL);
magic | checksum | version
-------+----------+---------
| |

(1 row)

Similar behavior with pg_get_logical_snapshot_info(). While the
existing 'pg_ls_logicalsnapdir' function gives this error, which looks
more meaningful:

newdb1=# select * from pg_ls_logicalsnapdir(NULL);
ERROR: function pg_ls_logicalsnapdir(unknown) does not exist
LINE 1: select * from pg_ls_logicalsnapdir(NULL);
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.

Shouldn't the new functions have same behavior?

No. Since the name pg_ls_logicalsnapdir has zero single-argument
implementations passing a null value as an argument is indeed attempt to
invoke a function signature that doesn’t exist.

Agree.

I can see an argument that they should produce an empty set instead of a
single all-null row,

Yeah, it's outside the scope of this patch but I've seen different behavior
in this area.

For example:

postgres=# select * from pg_ls_replslotdir(NULL);
name | size | modification
------+------+--------------
(0 rows)

as compared to:

postgres=# select * from pg_walfile_name_offset(NULL);
file_name | file_offset
-----------+-------------
|
(1 row)

I thought that it might be linked to the volatility but it is not:

postgres=# select * from pg_stat_get_xact_blocks_fetched(NULL);
pg_stat_get_xact_blocks_fetched
---------------------------------

(1 row)

postgres=# select * from pg_get_multixact_members(NULL);
xid | mode
-----+------
(0 rows)

while both are volatile.

I think both make sense: It's "empty" or we "don't know the values of the fields".
I don't know if there is any reason about this "inconsistency".

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

#21Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: shveta malik (#17)
#22shveta malik
shveta.malik@gmail.com
In reply to: David G. Johnston (#19)
#23shveta malik
shveta.malik@gmail.com
In reply to: Bertrand Drouvot (#21)
#24Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: shveta malik (#23)
#25Peter Smith
smithpb2250@gmail.com
In reply to: Bertrand Drouvot (#21)
#26Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Smith (#25)
#27Peter Smith
smithpb2250@gmail.com
In reply to: Bertrand Drouvot (#26)
#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#21)
#29shveta malik
shveta.malik@gmail.com
In reply to: Bertrand Drouvot (#26)
#30Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Smith (#27)
#31Peter Smith
smithpb2250@gmail.com
In reply to: Bertrand Drouvot (#30)
#32shveta malik
shveta.malik@gmail.com
In reply to: Bertrand Drouvot (#30)
#33shveta malik
shveta.malik@gmail.com
In reply to: Peter Smith (#31)
#34Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Smith (#31)
#35Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: shveta malik (#32)
#36Peter Smith
smithpb2250@gmail.com
In reply to: Bertrand Drouvot (#34)
#37shveta malik
shveta.malik@gmail.com
In reply to: Bertrand Drouvot (#35)
#38Peter Eisentraut
peter_e@gmx.net
In reply to: Bertrand Drouvot (#34)
#39Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Eisentraut (#38)
#40Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: shveta malik (#37)
#41Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#39)
#42shveta malik
shveta.malik@gmail.com
In reply to: Bertrand Drouvot (#40)
#43Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#41)
#44Peter Smith
smithpb2250@gmail.com
In reply to: Bertrand Drouvot (#43)
#45Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Smith (#44)
#46Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#45)
#47Peter Smith
smithpb2250@gmail.com
In reply to: Bertrand Drouvot (#45)
#48Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Smith (#47)
#49Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#46)
#50Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#49)
#51Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#50)
#52Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#51)
#53Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#52)
#54Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#53)
#55Peter Smith
smithpb2250@gmail.com
In reply to: Bertrand Drouvot (#53)
#56Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#54)
#57Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Smith (#55)
#58Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#56)
#59Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Masahiko Sawada (#58)
#60Peter Smith
smithpb2250@gmail.com
In reply to: Masahiko Sawada (#59)
#61Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#59)
#62Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Peter Smith (#60)
#63Peter Smith
smithpb2250@gmail.com
In reply to: Bertrand Drouvot (#62)
#64Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#62)
#65Andres Freund
andres@anarazel.de
In reply to: Masahiko Sawada (#64)
#66Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andres Freund (#65)
#67Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#66)
#68Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#67)
#69Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#68)
#70Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bertrand Drouvot (#69)
#71Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Tom Lane (#70)
#72Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#71)
#73Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#69)
#74Amit Kapila
amit.kapila16@gmail.com
In reply to: Masahiko Sawada (#73)
#75Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Amit Kapila (#74)
#76Amit Kapila
amit.kapila16@gmail.com
In reply to: Bertrand Drouvot (#75)
#77Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#75)
#78Bertrand Drouvot
bertranddrouvot.pg@gmail.com
In reply to: Masahiko Sawada (#77)
#79Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Bertrand Drouvot (#78)
#80Euler Taveira
euler@eulerto.com
In reply to: Masahiko Sawada (#79)
#81Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Euler Taveira (#80)