Corrected documentation of data type for the logical replication message formats.

Started by vignesh Cover 4 years ago25 messages
#1vignesh C
vignesh C
vignesh21@gmail.com
1 attachment(s)

Hi,

For some of the logical replication messages the data type documented
was not correct, especially for lsn and xid. For lsn actual datatype
used is uint64 but is documented as int64, similarly for xid, datatype
used is uint32 but documented as int32.
Attached is a patch which has the fix for the same.
Thoughts?

Regards,
Vignesh

Attachments:

v1-0001-Corrected-data-type-for-the-logical-replication-m.patchtext/x-patch; charset=US-ASCII; name=v1-0001-Corrected-data-type-for-the-logical-replication-m.patch
#2Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#1)
Re: Corrected documentation of data type for the logical replication message formats.

On Sun, May 9, 2021 at 10:38 PM vignesh C <vignesh21@gmail.com> wrote:

Hi,

For some of the logical replication messages the data type documented
was not correct, especially for lsn and xid. For lsn actual datatype
used is uint64 but is documented as int64, similarly for xid, datatype
used is uint32 but documented as int32.
Attached is a patch which has the fix for the same.
Thoughts?

If you want to do this then there are more - e.g. Flags should be
Uint8 instead of Int8.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#3Euler Taveira
Euler Taveira
euler@eulerto.com
In reply to: vignesh C (#1)
Re: Corrected documentation of data type for the logical replication message formats.

On Sun, May 9, 2021, at 9:37 AM, vignesh C wrote:

For some of the logical replication messages the data type documented
was not correct, especially for lsn and xid. For lsn actual datatype
used is uint64 but is documented as int64, similarly for xid, datatype
used is uint32 but documented as int32.
Attached is a patch which has the fix for the same.
Thoughts?

There was a discussion [1]/messages/by-id/CAH503wBwC8A7DbDYUXRqW1ZAHKpj+D9bN7hcgszvP_1FzXbs_Q@mail.gmail.com a few months ago about it. Read the Message Data
Types definition [2]https://www.postgresql.org/docs/current/protocol-message-types.html. It is confusing that an internal data type (int64) has a
name similar to a generic data type in a protocol definition. As I said [1]/messages/by-id/CAH503wBwC8A7DbDYUXRqW1ZAHKpj+D9bN7hcgszvP_1FzXbs_Q@mail.gmail.com we
should probably inform that that piece of information (LSN) is a XLogRecPtr.
Since this chapter is intended for developers, I think it is fine to include
such internal detail.

[1]: /messages/by-id/CAH503wBwC8A7DbDYUXRqW1ZAHKpj+D9bN7hcgszvP_1FzXbs_Q@mail.gmail.com
[2]: https://www.postgresql.org/docs/current/protocol-message-types.html

--
Euler Taveira
EDB https://www.enterprisedb.com/

#4Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: Peter Smith (#2)
Re: Corrected documentation of data type for the logical replication message formats.

On Sun, May 9, 2021 at 11:13 PM Peter Smith <smithpb2250@gmail.com> wrote:

On Sun, May 9, 2021 at 10:38 PM vignesh C <vignesh21@gmail.com> wrote:

Hi,

For some of the logical replication messages the data type documented
was not correct, especially for lsn and xid. For lsn actual datatype
used is uint64 but is documented as int64, similarly for xid, datatype
used is uint32 but documented as int32.
Attached is a patch which has the fix for the same.
Thoughts?

If you want to do this then there are more - e.g. Flags should be
Uint8 instead of Int8.

Irrespective of signed/unsigned, from the description of types [1]https://www.postgresql.org/docs/devel/protocol-message-types.html it
does seem like all those unused "(must be 0)" replication flags ought
to have been written as "Int8(0)" instead of "Int8".

------
[1]: https://www.postgresql.org/docs/devel/protocol-message-types.html

Kind Regards,
Peter Smith.
Fujitsu Australia

#5vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Euler Taveira (#3)
1 attachment(s)
Re: Corrected documentation of data type for the logical replication message formats.

On Sun, May 9, 2021 at 6:54 PM Euler Taveira <euler@eulerto.com> wrote:

On Sun, May 9, 2021, at 9:37 AM, vignesh C wrote:

For some of the logical replication messages the data type documented
was not correct, especially for lsn and xid. For lsn actual datatype
used is uint64 but is documented as int64, similarly for xid, datatype
used is uint32 but documented as int32.
Attached is a patch which has the fix for the same.
Thoughts?

There was a discussion [1] a few months ago about it. Read the Message Data
Types definition [2]. It is confusing that an internal data type (int64) has a
name similar to a generic data type in a protocol definition. As I said [1] we
should probably inform that that piece of information (LSN) is a XLogRecPtr.
Since this chapter is intended for developers, I think it is fine to include
such internal detail.

I agree to specifying the actual dataypes like XLogRecPtr for lsn,
TimestampTz for timestamp, TransactionId for xid and Oid for the
object id. Attached v2 patch which is changed on similar lines.
Thoughts?

Regards,
Vignesh

Attachments:

v2-0001-Corrected-data-type-for-the-logical-replication-m.patchapplication/x-patch; name=v2-0001-Corrected-data-type-for-the-logical-replication-m.patch
#6vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#2)
Re: Corrected documentation of data type for the logical replication message formats.

On Sun, May 9, 2021 at 6:44 PM Peter Smith <smithpb2250@gmail.com> wrote:

On Sun, May 9, 2021 at 10:38 PM vignesh C <vignesh21@gmail.com> wrote:

Hi,

For some of the logical replication messages the data type documented
was not correct, especially for lsn and xid. For lsn actual datatype
used is uint64 but is documented as int64, similarly for xid, datatype
used is uint32 but documented as int32.
Attached is a patch which has the fix for the same.
Thoughts?

If you want to do this then there are more - e.g. Flags should be
Uint8 instead of Int8.

Thanks for the comments.
I have made this change in v2 patch posted at [1]/messages/by-id/CALDaNm2G_BJ9G=Cxy9A6ht-TXPn4nB8W9_BcawuA1uxsNvoWfQ@mail.gmail.com.
This also includes the fix to specify uint8(0) at appropriate places.

[1]: /messages/by-id/CALDaNm2G_BJ9G=Cxy9A6ht-TXPn4nB8W9_BcawuA1uxsNvoWfQ@mail.gmail.com
/messages/by-id/CALDaNm2G_BJ9G=Cxy9A6ht-TXPn4nB8W9_BcawuA1uxsNvoWfQ@mail.gmail.com

Regards,
Vignesh

#7Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#5)
Re: Corrected documentation of data type for the logical replication message formats.

On Mon, May 10, 2021 at 11:46 PM vignesh C <vignesh21@gmail.com> wrote:

On Sun, May 9, 2021 at 6:54 PM Euler Taveira <euler@eulerto.com> wrote:

On Sun, May 9, 2021, at 9:37 AM, vignesh C wrote:

For some of the logical replication messages the data type documented
was not correct, especially for lsn and xid. For lsn actual datatype
used is uint64 but is documented as int64, similarly for xid, datatype
used is uint32 but documented as int32.
Attached is a patch which has the fix for the same.
Thoughts?

There was a discussion [1] a few months ago about it. Read the Message Data
Types definition [2]. It is confusing that an internal data type (int64) has a
name similar to a generic data type in a protocol definition. As I said [1] we
should probably inform that that piece of information (LSN) is a XLogRecPtr.
Since this chapter is intended for developers, I think it is fine to include
such internal detail.

I agree to specifying the actual dataypes like XLogRecPtr for lsn,
TimestampTz for timestamp, TransactionId for xid and Oid for the
object id. Attached v2 patch which is changed on similar lines.
Thoughts?

Adding new message "types" does not seem like a good idea to me. e.g.
All the message types must be defined by the page [1]https://www.postgresql.org/docs/devel/protocol-message-types.html so if you add
new ones then they should also be defined on that page. But then how
many other places ought to make use of those new types? IMO this
approach will snowball out of control.

But I am also doubtful there was ever actually a (signed/unsigned)
problem in the first place. AFAIK the message types like "Int32" etc
just happen to have a name that "looks" like a C type, but I think
that is the extent of it. It is simply saying how data bytes are
transferred on the wire. All the low level C functions [2]https://linux.die.net/man/3/ntohl always deal
with unsigned.

~~

My suggestion would be to restrict your changes to the *description*
parts of each message. e.g. maybe you could say what C type the bytes
represent when they come off the wire at the other end - something
like below.

BEFORE
Int64
The final LSN of the transaction.

AFTER
Int64
The final LSN (XLogRecPtr) of the transaction

------
[1]: https://www.postgresql.org/docs/devel/protocol-message-types.html
[2]: https://linux.die.net/man/3/ntohl

Kind Regards,
Peter Smith.
Fujitsu Australia.

#8Euler Taveira
Euler Taveira
euler@eulerto.com
In reply to: vignesh C (#5)
Re: Corrected documentation of data type for the logical replication message formats.

On Mon, May 10, 2021, at 10:45 AM, vignesh C wrote:

I agree to specifying the actual dataypes like XLogRecPtr for lsn,
TimestampTz for timestamp, TransactionId for xid and Oid for the
object id. Attached v2 patch which is changed on similar lines.
Thoughts?

Perhaps I didn't make myself clear, I didn't suggest to replace the actual
message data types [1]https://www.postgresql.org/docs/current/protocol-message-types.html with the internal representation. We could probably
expand the description to include the internal representation. Hence, it is
less confusing that the actual text. Peter suggested the same in a previous
email.

Although, "Message Data Types" is one section before "Message Formats", it is
probably intuitive that the data type for each message refer to the previous
section. However, it is not so clear three section later. A sentence like

The base data types used are described in the section "Messages Data Types".

at the first paragraph could help understand what these data types refer to
(and also add a link to the data types section).

[1]: https://www.postgresql.org/docs/current/protocol-message-types.html

--
Euler Taveira
EDB https://www.enterprisedb.com/

#9vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#7)
1 attachment(s)
Re: Corrected documentation of data type for the logical replication message formats.

On Tue, May 11, 2021 at 8:06 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Mon, May 10, 2021 at 11:46 PM vignesh C <vignesh21@gmail.com> wrote:

On Sun, May 9, 2021 at 6:54 PM Euler Taveira <euler@eulerto.com> wrote:

On Sun, May 9, 2021, at 9:37 AM, vignesh C wrote:

For some of the logical replication messages the data type documented
was not correct, especially for lsn and xid. For lsn actual datatype
used is uint64 but is documented as int64, similarly for xid, datatype
used is uint32 but documented as int32.
Attached is a patch which has the fix for the same.
Thoughts?

There was a discussion [1] a few months ago about it. Read the Message Data
Types definition [2]. It is confusing that an internal data type (int64) has a
name similar to a generic data type in a protocol definition. As I said [1] we
should probably inform that that piece of information (LSN) is a XLogRecPtr.
Since this chapter is intended for developers, I think it is fine to include
such internal detail.

I agree to specifying the actual dataypes like XLogRecPtr for lsn,
TimestampTz for timestamp, TransactionId for xid and Oid for the
object id. Attached v2 patch which is changed on similar lines.
Thoughts?

Adding new message "types" does not seem like a good idea to me. e.g.
All the message types must be defined by the page [1] so if you add
new ones then they should also be defined on that page. But then how
many other places ought to make use of those new types? IMO this
approach will snowball out of control.

But I am also doubtful there was ever actually a (signed/unsigned)
problem in the first place. AFAIK the message types like "Int32" etc
just happen to have a name that "looks" like a C type, but I think
that is the extent of it. It is simply saying how data bytes are
transferred on the wire. All the low level C functions [2] always deal
with unsigned.

~~

My suggestion would be to restrict your changes to the *description*
parts of each message. e.g. maybe you could say what C type the bytes
represent when they come off the wire at the other end - something
like below.

BEFORE
Int64
The final LSN of the transaction.

AFTER
Int64
The final LSN (XLogRecPtr) of the transaction

Thanks for the comments, Attached v3 patch has the changes as suggested.

Regards,
Vignesh

Attachments:

v3-0001-Included-the-actual-datatype-used-in-logical-repl.patchtext/x-patch; charset=US-ASCII; name=v3-0001-Included-the-actual-datatype-used-in-logical-repl.patch
#10vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Euler Taveira (#8)
Re: Corrected documentation of data type for the logical replication message formats.

On Tue, May 11, 2021 at 9:09 AM Euler Taveira <euler@eulerto.com> wrote:

On Mon, May 10, 2021, at 10:45 AM, vignesh C wrote:

I agree to specifying the actual dataypes like XLogRecPtr for lsn,
TimestampTz for timestamp, TransactionId for xid and Oid for the
object id. Attached v2 patch which is changed on similar lines.
Thoughts?

Perhaps I didn't make myself clear, I didn't suggest to replace the actual
message data types [1] with the internal representation. We could probably
expand the description to include the internal representation. Hence, it

is

less confusing that the actual text. Peter suggested the same in a

previous

email.

Although, "Message Data Types" is one section before "Message Formats",

it is

probably intuitive that the data type for each message refer to the

previous

section. However, it is not so clear three section later. A sentence like

The base data types used are described in the section "Messages Data

Types".

at the first paragraph could help understand what these data types refer

to

(and also add a link to the data types section).

I have included this at the beginning, the same is available in the patch
posted at [1]/messages/by-id/CALDaNm2QrB-_96ohonQs-YADC9Puk3caXjn+2UYZwxAkX=REQQ@mail.gmail.com.
[1]: /messages/by-id/CALDaNm2QrB-_96ohonQs-YADC9Puk3caXjn+2UYZwxAkX=REQQ@mail.gmail.com
/messages/by-id/CALDaNm2QrB-_96ohonQs-YADC9Puk3caXjn+2UYZwxAkX=REQQ@mail.gmail.com

Regards,
Vignesh

#11Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#9)
Re: Corrected documentation of data type for the logical replication message formats.

On Wed, May 12, 2021 at 1:02 AM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, Attached v3 patch has the changes as suggested.

This v3 mostly looks good to me now except for some minor comments
about the flags.

~~~

1. Commit flags

@@ -6534,11 +6536,11 @@ Commit
 </varlistentry>
 <varlistentry>
 <term>
-        Int8
+        Uint8(0)
 </term>
 <listitem>
 <para>
-                Flags; currently unused (must be 0).
+                Flags (uint8(0)); currently unused (must be 0).
 </para>
 </listitem>
 </varlistentry>

a) There is no data type Uint8. That should be "Int8(0)"

b) I think the "(0)" does not belong in the description part.
e.g. change to "Flags (uint8); currently unused (must be 0)."

~~~

2. Stream Commit flags

@@ -7276,7 +7284,7 @@ Stream Commit
 </term>
 <listitem>
 <para>
-                Flags; currently unused (must be 0).
+                Flags (uint8(0)); currently unused (must be 0).
 </para>
 </listitem>
 </varlistentry>

a) The data type should say "Int8(0)"

b) I think the "(0)" does not belong in the description part.
e.g. change to "Flags (uint8); currently unused (must be 0)."

~~~

3. I felt that saying "(must be 0)" for those unused flag descriptions
is unnecessary since that is exactly what "Int8(0)" already means.
e.g. consider change to just say "Flags (uint8); currently unused."

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#12vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#11)
1 attachment(s)
Re: Corrected documentation of data type for the logical replication message formats.

On Wed, May 12, 2021 at 3:36 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, May 12, 2021 at 1:02 AM vignesh C <vignesh21@gmail.com> wrote:

Thanks for the comments, Attached v3 patch has the changes as suggested.

This v3 mostly looks good to me now except for some minor comments
about the flags.

~~~

1. Commit flags

@@ -6534,11 +6536,11 @@ Commit
</varlistentry>
<varlistentry>
<term>
-        Int8
+        Uint8(0)
</term>
<listitem>
<para>
-                Flags; currently unused (must be 0).
+                Flags (uint8(0)); currently unused (must be 0).
</para>
</listitem>
</varlistentry>

a) There is no data type Uint8. That should be "Int8(0)"

Modified.

b) I think the "(0)" does not belong in the description part.
e.g. change to "Flags (uint8); currently unused (must be 0)."

Modified

~~~

2. Stream Commit flags

@@ -7276,7 +7284,7 @@ Stream Commit
</term>
<listitem>
<para>
-                Flags; currently unused (must be 0).
+                Flags (uint8(0)); currently unused (must be 0).
</para>
</listitem>
</varlistentry>

a) The data type should say "Int8(0)"

Modified.

b) I think the "(0)" does not belong in the description part.
e.g. change to "Flags (uint8); currently unused (must be 0)."

Modified.

~~~

3. I felt that saying "(must be 0)" for those unused flag descriptions
is unnecessary since that is exactly what "Int8(0)" already means.
e.g. consider change to just say "Flags (uint8); currently unused."

Modified.

Thanks for the comments. Attached v4 patch has the fix for the same.

Regards,
Vignesh

Attachments:

v4-0001-Included-the-actual-datatype-used-in-logical-repl.patchapplication/x-patch; name=v4-0001-Included-the-actual-datatype-used-in-logical-repl.patch
#13Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#12)
Re: Corrected documentation of data type for the logical replication message formats.

On Wed, May 12, 2021 at 11:09 PM vignesh C <vignesh21@gmail.com> wrote:
...

Thanks for the comments. Attached v4 patch has the fix for the same.

I have not tried this patch so I cannot confirm whether it applies or
renders OK, but just going by the v4 content this now LGTM.

--------
Kind Regards,
Peter Smith.
Fujitsu Australia

#14vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#13)
Re: Corrected documentation of data type for the logical replication message formats.

On Thu, May 13, 2021 at 5:57 AM Peter Smith <smithpb2250@gmail.com> wrote:

On Wed, May 12, 2021 at 11:09 PM vignesh C <vignesh21@gmail.com> wrote:
...

Thanks for the comments. Attached v4 patch has the fix for the same.

I have not tried this patch so I cannot confirm whether it applies or
renders OK, but just going by the v4 content this now LGTM.

Thanks for having a look at it.
I have added a commitfest entry for this:
https://commitfest.postgresql.org/33/3117/

Regards,
Vignesh

#15Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#14)
Re: Corrected documentation of data type for the logical replication message formats.

Hi Vignesh.

FYI - Because the other patch [1]https://github.com/postgres/postgres/commit/a8fd13cab0ba815e9925dc9676e6309f699b5f72#diff-331c33fd11c3ed85f9dbfead93f139c20ff3a25176651fc2ed37c486b97630e6 was pushed ahead of this one, I
think your patch now needs to be updated for the new message types
that were introduced.

------
[1]: https://github.com/postgres/postgres/commit/a8fd13cab0ba815e9925dc9676e6309f699b5f72#diff-331c33fd11c3ed85f9dbfead93f139c20ff3a25176651fc2ed37c486b97630e6

Kind Regards,
Peter Smith.
Fujitsu Australia

#16vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#15)
1 attachment(s)
Re: Corrected documentation of data type for the logical replication message formats.

On Fri, Jul 16, 2021 at 8:52 AM Peter Smith <smithpb2250@gmail.com> wrote:

Hi Vignesh.

FYI - Because the other patch [1] was pushed ahead of this one, I
think your patch now needs to be updated for the new message types
that were introduced.

Thanks, I have made the changes for the same in the v5 patch attached.

Regards,
Vignesh

Attachments:

v5-0001-Included-the-actual-datatype-used-in-logical-repl.patchapplication/x-patch; name=v5-0001-Included-the-actual-datatype-used-in-logical-repl.patch
#17Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#16)
Re: Corrected documentation of data type for the logical replication message formats.

I think the patch maybe is not quite correct for all the flags.

For example,

@@ -7607,44 +7615,44 @@ are available since protocol version 3.
 <varlistentry>
 <term>Int8</term>
 <listitem><para>
-                Flags; currently unused (must be 0).
+                Flags (uint8); currently unused.
 </para></listitem>
 </varlistentry>

AFAIK, even though the flags are "unused", the code still insists that
most (or all? Please check the code) of these flag values MUST be 0,
so I think that this zero value requirement ought to be indicated in
the docs using the "Int8(0)" convention [1]https://www.postgresql.org/docs/devel/protocol-message-types.html. For example,

BEFORE
Int8
Flags (uint8); currently unused.

AFTER
Int8(0)
Flags (uint8); currently unused.

------
[1]: https://www.postgresql.org/docs/devel/protocol-message-types.html

Kind Regards,
Peter Smith.
Fujitsu Australia.

#18vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#17)
1 attachment(s)
Re: Corrected documentation of data type for the logical replication message formats.

On Fri, Jul 23, 2021 at 3:23 AM Peter Smith <smithpb2250@gmail.com> wrote:

I think the patch maybe is not quite correct for all the flags.

For example,

@@ -7607,44 +7615,44 @@ are available since protocol version 3.
<varlistentry>
<term>Int8</term>
<listitem><para>
-                Flags; currently unused (must be 0).
+                Flags (uint8); currently unused.
</para></listitem>
</varlistentry>

AFAIK, even though the flags are "unused", the code still insists that
most (or all? Please check the code) of these flag values MUST be 0,
so I think that this zero value requirement ought to be indicated in
the docs using the "Int8(0)" convention [1]. For example,

BEFORE
Int8
Flags (uint8); currently unused.

AFTER
Int8(0)
Flags (uint8); currently unused.

Thanks for the comments. Attached v6 patch has the changes for the same.

Regards,
Vignesh

Attachments:

v6-0001-Included-the-actual-datatype-used-in-logical-repl.patchtext/x-patch; charset=US-ASCII; name=v6-0001-Included-the-actual-datatype-used-in-logical-repl.patch
#19Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: vignesh C (#18)
Re: Corrected documentation of data type for the logical replication message formats.

vignesh C <vignesh21@gmail.com> writes:
[ v6-0001-Included-the-actual-datatype-used-in-logical-repl.patch ]

I see what you want to do here, but the way you did it seems quite
detrimental to the readability of the field descriptions.
Parenthesized interjections should be used sparingly.

I'm inclined to think that the equivalent data type is part of the
field data type specification, and thus that we ought to put it in
the data type part of each entry. So we'd have something like

<varlistentry>
<term>
Int64 (XLogRecPtr)
</term>
<listitem>
<para>
The final LSN of the transaction.
</para>
</listitem>
</varlistentry>

instead of what you did here. Parentheses might not be the best
punctuation to use, given the existing convention about parenthesized
specific values, but we could probably settle on some other markup.
Or just ignore the ambiguity.

Another idea is to add the data type info at the ends of items
instead of cramming it into the sentences, thus:

The final LSN of the transaction. (XLogRecPtr)

I don't find that better personally, but maybe others will
think differently.

regards, tom lane

#20Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: Tom Lane (#19)
Re: Corrected documentation of data type for the logical replication message formats.

On Sat, Jul 31, 2021 at 7:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

vignesh C <vignesh21@gmail.com> writes:
[ v6-0001-Included-the-actual-datatype-used-in-logical-repl.patch ]

I see what you want to do here, but the way you did it seems quite
detrimental to the readability of the field descriptions.
Parenthesized interjections should be used sparingly.

I'm inclined to think that the equivalent data type is part of the
field data type specification, and thus that we ought to put it in
the data type part of each entry. So we'd have something like

<varlistentry>
<term>
Int64 (XLogRecPtr)
</term>
<listitem>
<para>
The final LSN of the transaction.
</para>
</listitem>
</varlistentry>

instead of what you did here. Parentheses might not be the best
punctuation to use, given the existing convention about parenthesized
specific values, but we could probably settle on some other markup.
Or just ignore the ambiguity.

+1 to change it like suggested above.

The specific value for the flags might then look like below, but that
does not look too bad to me.

<term>
Int8 (uint8) (0)
</term>

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#21vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Tom Lane (#19)
1 attachment(s)
Re: Corrected documentation of data type for the logical replication message formats.

On Sat, Jul 31, 2021 at 2:30 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

vignesh C <vignesh21@gmail.com> writes:
[ v6-0001-Included-the-actual-datatype-used-in-logical-repl.patch ]

I see what you want to do here, but the way you did it seems quite
detrimental to the readability of the field descriptions.
Parenthesized interjections should be used sparingly.

I'm inclined to think that the equivalent data type is part of the
field data type specification, and thus that we ought to put it in
the data type part of each entry. So we'd have something like

<varlistentry>
<term>
Int64 (XLogRecPtr)
</term>
<listitem>
<para>
The final LSN of the transaction.
</para>
</listitem>
</varlistentry>

I made changes based on the feedback, since Peter also was in favour
of using this approach, I modified based on the first approach.
Attached v7 patch has the changes for the same.

Regards,
Vignesh

Attachments:

v7-0001-Included-the-actual-datatype-used-in-logical-repl.patchtext/x-patch; charset=US-ASCII; name=v7-0001-Included-the-actual-datatype-used-in-logical-repl.patch
#22vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Peter Smith (#20)
Re: Corrected documentation of data type for the logical replication message formats.

On Sun, Aug 1, 2021 at 4:11 PM Peter Smith <smithpb2250@gmail.com> wrote:

On Sat, Jul 31, 2021 at 7:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

vignesh C <vignesh21@gmail.com> writes:
[ v6-0001-Included-the-actual-datatype-used-in-logical-repl.patch ]

I see what you want to do here, but the way you did it seems quite
detrimental to the readability of the field descriptions.
Parenthesized interjections should be used sparingly.

I'm inclined to think that the equivalent data type is part of the
field data type specification, and thus that we ought to put it in
the data type part of each entry. So we'd have something like

<varlistentry>
<term>
Int64 (XLogRecPtr)
</term>
<listitem>
<para>
The final LSN of the transaction.
</para>
</listitem>
</varlistentry>

instead of what you did here. Parentheses might not be the best
punctuation to use, given the existing convention about parenthesized
specific values, but we could probably settle on some other markup.
Or just ignore the ambiguity.

+1 to change it like suggested above.

The specific value for the flags might then look like below, but that
does not look too bad to me.

<term>
Int8 (uint8) (0)
</term>

I felt we can change it like:
<term>
Int8(0) (uint8)
</term>

I felt the flag value can be kept first followed by the data type since it
is used similarly for the other message types like below:
<term>
Byte1('C')
</term>

I have made changes in similar lines and posted the patch at [1]/messages/by-id/CALDaNm3sK75Mo+VzLmNGe29gYtJoeKHshAK0GDiAzfAj6LQPdw@mail.gmail.com.
Thoughts?

[1]: /messages/by-id/CALDaNm3sK75Mo+VzLmNGe29gYtJoeKHshAK0GDiAzfAj6LQPdw@mail.gmail.com
/messages/by-id/CALDaNm3sK75Mo+VzLmNGe29gYtJoeKHshAK0GDiAzfAj6LQPdw@mail.gmail.com

Regards,
Vignesh

#23Peter Smith
Peter Smith
smithpb2250@gmail.com
In reply to: vignesh C (#22)
Re: Corrected documentation of data type for the logical replication message formats.

On Mon, Aug 2, 2021 at 1:32 AM vignesh C <vignesh21@gmail.com> wrote:

On Sun, Aug 1, 2021 at 4:11 PM Peter Smith <smithpb2250@gmail.com> wrote:

On Sat, Jul 31, 2021 at 7:00 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

vignesh C <vignesh21@gmail.com> writes:
[ v6-0001-Included-the-actual-datatype-used-in-logical-repl.patch ]

I see what you want to do here, but the way you did it seems quite
detrimental to the readability of the field descriptions.
Parenthesized interjections should be used sparingly.

I'm inclined to think that the equivalent data type is part of the
field data type specification, and thus that we ought to put it in
the data type part of each entry. So we'd have something like

<varlistentry>
<term>
Int64 (XLogRecPtr)
</term>
<listitem>
<para>
The final LSN of the transaction.
</para>
</listitem>
</varlistentry>

instead of what you did here. Parentheses might not be the best
punctuation to use, given the existing convention about parenthesized
specific values, but we could probably settle on some other markup.
Or just ignore the ambiguity.

+1 to change it like suggested above.

The specific value for the flags might then look like below, but that
does not look too bad to me.

<term>
Int8 (uint8) (0)
</term>

I felt we can change it like:
<term>
Int8(0) (uint8)
</term>

I felt the flag value can be kept first followed by the data type since it is used similarly for the other message types like below:
<term>
Byte1('C')
</term>

I have made changes in similar lines and posted the patch at [1].
Thoughts?

I agree. The specified value looks better when it comes first, as you did it.

~~

Option #1:
Int8(0) (uint8)
Int64 (XLogRecPtr)

Option #2:
Int8(0) [uint8]
Int64 [XLogRecPtr]

Option #3:
Int8(0) -- uint
Int64 -- XLogRecPtr

etc...

Probably my slight favourite is Option #2 above, but YMMV. Any format
you choose which is similar to those above is fine by me.

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#24Tom Lane
Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Smith (#23)
Re: Corrected documentation of data type for the logical replication message formats.

Peter Smith <smithpb2250@gmail.com> writes:

I agree. The specified value looks better when it comes first, as you did it.

Actually, it looks to me like we don't have to resolve the question of
which should come first, because I don't see any cases where it's
useful to have both. I don't agree with appending "uint8" to those
field descriptions, because it's adding no information, especially
when the high bit couldn't be set anyway.

At some point it might be useful to add UInt<n> to the set of base
data types, and then go through all the message types and decide
which fields we think are unsigned. But that is not this patch,
and there would be questions about whether it constituted a protocol
break.

I noticed also that having to add "(Oid)" was sort of self-inflicted
damage, because the field descriptions were using the very vague
term "ID", when they could have said "OID" and been clear. I left
the "(Oid)" additions in place but also changed the text.

Pushed with those changes. I couldn't resist copy-editing the section
intro, too.

regards, tom lane

#25vignesh C
vignesh C
vignesh21@gmail.com
In reply to: Tom Lane (#24)
Re: Corrected documentation of data type for the logical replication message formats.

On Mon, Aug 2, 2021 at 9:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Peter Smith <smithpb2250@gmail.com> writes:

I agree. The specified value looks better when it comes first, as you did it.

Actually, it looks to me like we don't have to resolve the question of
which should come first, because I don't see any cases where it's
useful to have both. I don't agree with appending "uint8" to those
field descriptions, because it's adding no information, especially
when the high bit couldn't be set anyway.

At some point it might be useful to add UInt<n> to the set of base
data types, and then go through all the message types and decide
which fields we think are unsigned. But that is not this patch,
and there would be questions about whether it constituted a protocol
break.

I noticed also that having to add "(Oid)" was sort of self-inflicted
damage, because the field descriptions were using the very vague
term "ID", when they could have said "OID" and been clear. I left
the "(Oid)" additions in place but also changed the text.

Pushed with those changes. I couldn't resist copy-editing the section
intro, too.

Thanks for pushing the patch.

Regards,
Vignesh