Update docs for UUID data type

Started by Andrew Alsup, about 1 year ago (13 messages, hackers)

#1 Andrew Alsup
bluesbreaker@gmail.com

The attached patch file makes some modest changes to the docs for the UUID
data type (Section 8.12. UUID Type). The main goal is to inform the reader
that there are multiple versions of UUID generation algorithms (presently
8); however, once generated, PostgreSQL treats all UUIDs uniformly.

Regards,
Andy Alsup

Attachments:

0001-UUID-datatype-docs.patch (application/octet-stream; +15 -7)

#2 Andrew Alsup
bluesbreaker@gmail.com
In reply to: Andrew Alsup (#1)
Re: Update docs for UUID data type

Please find the attached patch files that supersede the previous email.

Patch 0001 contains some modest modifications to the UUID data type docs
(Section 8.12. UUID Type). The main goal is to inform the reader that there
are multiple versions of UUID generation algorithms (presently 8); however,
once generated, PostgreSQL treats all UUIDs uniformly.

Patch 0002 contains modifications to the UUID functions docs (Section 9.14.
UUID Functions). The main goal is to format the UUID functions in table
form, similar to other function docs, such as Section 9.4. String Functions
and Operators. This provides the user a more consistent format, in line
with more established sections of the PostgreSQL documentation.

Thank you for your time and consideration.

Regards,
Andy Alsup

Attachments:

0001-docs-for-UUID-datatype-mention-UUID-versions.patch (application/octet-stream; +15 -7)
0002-docs-for-UUID-funcs-formatted-in-table.patch (application/octet-stream; +141 -37)

#3 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Andrew Alsup (#2)
Re: Update docs for UUID data type

On Sun, 2025-02-23 at 22:23 -0500, Andy Alsup wrote:

> Please find the attached patch files that supersede the previous email.
>
> Patch 0001 contains some modest modifications to the UUID data type docs
> (Section 8.12. UUID Type). The main goal is to inform the reader that there
> are multiple versions of UUID generation algorithms (presently 8); however,
> once generated, PostgreSQL treats all UUIDs uniformly.
>
> Patch 0002 contains modifications to the UUID functions docs (Section 9.14.
> UUID Functions). The main goal is to format the UUID functions in table form,
> similar to other function docs, such as Section 9.4. String Functions and
> Operators. This provides the user a more consistent format, in line with
> more established sections of the PostgreSQL documentation.
>
> Thank you for your time and consideration.

I had a look at the patches.

About the first patch:

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 87679dc4a11..9841b125e06 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4399,12 +4399,21 @@ SELECT to_tsvector( 'postgraduate' ), to_tsquery( 'postgres:*' );
ISO/IEC 9834-8:2005, and related standards.
(Some systems refer to this data type as a globally unique identifier, or
GUID,<indexterm><primary>GUID</primary></indexterm> instead.)  This
-    identifier is a 128-bit quantity that is generated by an algorithm chosen
-    to make it very unlikely that the same identifier will be generated by
-    anyone else in the known universe using the same algorithm.  Therefore,
-    for distributed systems, these identifiers provide a better uniqueness
-    guarantee than sequence generators, which
-    are only unique within a single database.
+    identifier is a 128-bit quantity generated by an algorithm chosen to make it
+    extremely unlikely that the same identifier will be generated by any other system.
+    Therefore, for distributed systems, these identifiers offer better uniqueness
+    guarantees than sequence generators, which only guarantee uniqueness within a
+    single database.
+   </para>
+
+   <para>
+    The UUID RFC defines 8 discrete UUID versions. Each version has specific requirements
+    for generating new UUID values, and each version provides distinct benefits and drawbacks.
+    PostgreSQL provides native support for generating UUIDs using the UUIDv4 and
+    UUIDv7 algorithms. Alternatively, UUID values can be generated outside of the
+    PostgreSQL database using any algorithm. In any case, PostgreSQL supports the
+    <type>uuid</type> datatype uniformly, regardless of the UUID version or whether it
+    was generated internally or externally.

"PostgreSQL" should wear a <productname> tag.

Your change to the first paragraph is just the removal of "that is" and
rearranging the line breaks. I don't think that the wording becomes any
clearer through that change, and it makes reading the patch more difficult.
It is a good idea to change as little as possible in the existing text
(particularly in the line breaks), so that reviewing becomes easier.

About the new paragraph: it should be "different", not "discrete".

I am not certain that the part after "alternatively" adds any relevant
information. Also, I am not certain what you mean by "uniformly".
Perhaps that sentence could be

The PostgreSQL data type <type>uuid</type> supports all kinds of UUIDs,
regardless of their version.

We don't mention that "integer" can be used to store integers generated
inside and outside PostgreSQL, so I don't think we need to mention that
here.

About the second patch:

A table is a good thing. We typically have an introductory paragraph
before such tables that contains a hyperlink to the table, something like

<xref ...> shows the <productname>PostgreSQL</productname> functions
that can be used to generate UUIDs:

Yours,
Laurenz Albe

#4 Andrew Alsup
bluesbreaker@gmail.com
In reply to: Laurenz Albe (#3)
Re: Update docs for UUID data type

Please find the attached patch, which only addresses the UUID functions (in
table format). I appreciate the comments related to the UUID datatype. If
you feel like the additional content didn't add clarity, I certainly won't
argue.

Best regards,
Andy Alsup

Attachments:

v2-0001-docs-for-UUID-funcs-formatted-in-table.patch (application/octet-stream; +153 -37)

#5 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Andrew Alsup (#4)
Re: Update docs for UUID data type

On Mon, 2025-02-24 at 21:04 -0500, Andy Alsup wrote:

> Please find the attached patch, which only addresses the UUID functions
> (in table format). I appreciate the comments related to the UUID datatype.
> If you feel like the additional content didn't add clarity, I certainly won't argue.

Your patch looks good to me.

I didn't mean that adding more information about the "uuid" data type is
a bad thing. Perhaps that additional paragraph could be

RFC 9562 defines 8 different UUID versions. Each version has specific requirements
for generating new UUID values, and each version provides distinct benefits and drawbacks.
<productname>PostgreSQL</productname> provides native support for generating UUIDs
using the UUIDv4 and UUIDv7 algorithms. Alternatively, UUID values can be generated
outside of the database using any algorithm. The data type <type>uuid</type> can be used
to store any UUID, regardless of the origin and the UUID version.

I would be happy if you added something like that again.

Yours,
Laurenz Albe
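
Editorial note: the point of the suggested paragraph can be illustrated outside the database. The sketch below is not part of any patch in this thread. It uses Python's standard `uuid` module to produce a UUIDv4, and a hypothetical helper, `make_uuid_v7`, that assembles a UUIDv7 by hand following the field layout in RFC 9562 (48-bit Unix millisecond timestamp, 4-bit version, 12 random bits, 2-bit variant, 62 random bits). The helper is an assumption made for illustration only and is not how PostgreSQL generates UUIDv7 values. Whatever the version, the result is the same kind of 128-bit value, which is why a single data type can store all of them.

```python
import os
import time
import uuid
from typing import Optional


def make_uuid_v7(unix_ms: Optional[int] = None) -> uuid.UUID:
    """Illustrative, hand-rolled UUIDv7 using the RFC 9562 bit layout."""
    if unix_ms is None:
        unix_ms = time.time_ns() // 1_000_000
    rand = int.from_bytes(os.urandom(10), "big")  # 80 random bits; 74 are used
    rand_a = rand >> 68                 # top 12 random bits
    rand_b = rand & ((1 << 62) - 1)     # low 62 random bits

    value = (unix_ms & ((1 << 48) - 1)) << 80  # 48-bit timestamp, most significant
    value |= 0x7 << 76                  # 4-bit version field = 7
    value |= rand_a << 64               # 12 bits of randomness
    value |= 0b10 << 62                 # 2-bit variant field
    value |= rand_b                     # 62 bits of randomness
    return uuid.UUID(int=value)


if __name__ == "__main__":
    for u in (uuid.uuid4(), make_uuid_v7()):
        # Both versions are 16-byte values; only the embedded version differs.
        print(u, "version:", u.version, "bytes:", len(u.bytes))
```

The version-specific trade-off mentioned in the paragraph shows up here: the hand-built v7 values sort by their leading timestamp, while v4 values are random throughout. Either textual form could be stored in a `uuid` column.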

#6 Andrew Alsup
bluesbreaker@gmail.com
In reply to: Laurenz Albe (#5)
Re: Update docs for UUID data type

Thank you for the clarification, and the well-worded paragraph. Please find
the latest patch files attached.

Best regards,
Andy Alsup

Attachments:

v3-0001-docs-for-UUID-funcs-formatted-in-table.patch (application/octet-stream; +153 -37)
v3-0002-docs-for-UUID-datatype-mention-UUID-versions.patch (application/octet-stream; +9 -1)

#7 Laurenz Albe
laurenz.albe@cybertec.at
In reply to: Andrew Alsup (#6)
Re: Update docs for UUID data type

On Wed, 2025-02-26 at 22:11 -0500, Andy Alsup wrote:

> Please find the latest patch files attached.

This is good to go. If you add it to the commitfest, I'm happy to
mark it "ready for committer".

Yours,
Laurenz Albe

#8 Andrew Alsup
bluesbreaker@gmail.com
In reply to: Laurenz Albe (#7)
Re: Update docs for UUID data type

I've submitted it for the upcoming commitfest. The link is:
https://commitfest.postgresql.org/patch/5604/
Thanks for all your help in reviewing these changes.

Best Regards,
Andy Alsup

#9 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andrew Alsup (#8)
Re: Update docs for UUID data type

On Thu, Feb 27, 2025 at 1:26 PM Andy Alsup <bluesbreaker@gmail.com> wrote:

> I've submitted it for the upcoming commitfest. The link is: https://commitfest.postgresql.org/patch/5604/
> Thanks for all your help in reviewing these changes.

Thank you for the patch!

Regarding the 0001 patch, I think we can put uuidv4() and
gen_random_uuid() in the same row since they are effectively identical
functions. We have precedent for this, such as char_length() and
character_length().

The 0002 patch looks good to me.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

#10 Andrew Alsup
bluesbreaker@gmail.com
In reply to: Masahiko Sawada (#9)
Re: Update docs for UUID data type

Masahiko,

I have combined the gen_random_uuid() and uuidv4() into a single row, as
you suggested. Please find the v5 patch, which has been squashed into a
single commit.

Best regards,
Andy Alsup

Attachments:

v5-0001-Docs-for-UUID-funcs-formatted-in-table-and-UUID-d.patch (application/octet-stream; +155 -37)

#11 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andrew Alsup (#10)
Re: Update docs for UUID data type

On Thu, Feb 27, 2025 at 5:50 PM Andy Alsup <bluesbreaker@gmail.com> wrote:

> Masahiko,
>
> I have combined the gen_random_uuid() and uuidv4() into a single row, as you suggested. Please find the v5 patch, which has been squashed into a single commit.

Thank you for updating the patch!

I like that the patch adds the reference to the uuid data type. But I
think we might want to adjust terminology:

+  <para>
+   See <xref linkend="datatype-uuid"/> for how details on the UUID datatype in
+    <productname>PostgreSQL</productname>.
+  </para>

In the 9.14. UUID Functions section, we use the word 'UUID' for data
generated by the algorithms defined in RFC 9562, whereas we use
uuid (i.e., <type>uuid</type> in func.sgml) to refer to the PostgreSQL data type.
IIUC, 'UUID datatype' in the above change is meant to refer to the
latter, PostgreSQL's uuid data type. Is that correct? If so, how about
the following change?

See <xref linkend="datatype-uuid"/> for details on the data type
<type>uuid</type> in <productname>PostgreSQL</productname>.

I've attached the updated patch that incorporates the above change,
and updated the commit message too.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachments:

0001-doc-Convert-UUID-functions-list-to-table-format.patch (application/octet-stream; +155 -37)

#12 Andrew Alsup
bluesbreaker@gmail.com
In reply to: Masahiko Sawada (#11)
Re: Update docs for UUID data type

Masahiko,

I like the change you've made.

Thanks,
Andy Alsup

#13 Masahiko Sawada
sawada.mshk@gmail.com
In reply to: Andrew Alsup (#12)
Re: Update docs for UUID data type

On Fri, Feb 28, 2025 at 1:44 PM Andy Alsup <bluesbreaker@gmail.com> wrote:

> Masahiko,
>
> I like the change you've made.

Pushed.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com