Questionable description in datatype.sgml

Started by Tatsuo Ishii over 9 years ago, 5 messages
#1 Tatsuo Ishii
ishii@postgresql.org

In "8.13.2. Encoding Handling"

<para>
When using binary mode to pass query parameters to the server
and query results back to the client, no character set conversion
is performed, so the situation is different. In this case, an
encoding declaration in the XML data will be observed, and if it
is absent, the data will be assumed to be in UTF-8 (as required by
the XML standard; note that PostgreSQL does not support UTF-16).
On output, data will have an encoding declaration
specifying the client encoding, unless the client encoding is
UTF-8, in which case it will be omitted.
</para>

In the first sentence shouldn't "no character set conversion" be "no
encoding conversion"? PostgreSQL is doing client/server encoding
conversion, rather than character set conversion.
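
The binary-mode rules in the quoted paragraph can be sketched in Python as follows. This is only an illustration of the documented behavior, not PostgreSQL's implementation: the function names, the regular expression, and the assumption that the output text carries no declaration of its own are all hypothetical.

```python
import re

# Optional encoding pseudo-attribute in a leading XML declaration, e.g.
# <?xml version="1.0" encoding="ISO-8859-1"?>
_XML_DECL = re.compile(
    rb'^<\?xml\s+version\s*=\s*["\'][^"\']*["\']'
    rb'(?:\s+encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\'])?'
)


def declared_encoding(xml_bytes: bytes) -> str:
    """Return the declared encoding, or UTF-8 when no declaration is present."""
    match = _XML_DECL.match(xml_bytes)
    if match and match.group(1):
        return match.group(1).decode("ascii")
    return "UTF-8"


def decode_binary_input(xml_bytes: bytes) -> str:
    """Input side: honor the declaration; nothing is converted beforehand."""
    return xml_bytes.decode(declared_encoding(xml_bytes))


def encode_binary_output(xml_text: str, client_encoding: str) -> bytes:
    """Output side: prepend a declaration unless the client encoding is UTF-8.

    Assumes xml_text carries no declaration of its own (hypothetical simplification).
    """
    body = xml_text.encode(client_encoding)
    if client_encoding.replace("-", "").upper() == "UTF8":
        return body
    declaration = f'<?xml version="1.0" encoding="{client_encoding}"?>'
    return declaration.encode("ascii") + body
```

In this sketch the bytes are never re-encoded on the way in; they are only interpreted according to the declaration, which is the behavior the first sentence of the paragraph describes.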

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tatsuo Ishii (#1)
Re: Questionable description in datatype.sgml

Tatsuo Ishii <ishii@postgresql.org> writes:

> In the first sentence shouldn't "no character set conversion" be "no
> encoding conversion"? PostgreSQL is doing client/server encoding
> conversion, rather than character set conversion.

I think the text is treating "character set conversion" as meaning
the same thing as "encoding conversion"; certainly I've never seen
any place in our docs that draws a distinction between those terms.
If you think there is a difference, maybe we need to define those
terms somewhere.

regards, tom lane


#3 Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#2)
Re: Questionable description in datatype.sgml

On Sat, Jun 18, 2016 at 11:58:58AM -0400, Tom Lane wrote:

> I think the text is treating "character set conversion" as meaning
> the same thing as "encoding conversion"; certainly I've never seen
> any place in our docs that draws a distinction between those terms.
> If you think there is a difference, maybe we need to define those
> terms somewhere.

Uh, I think Unicode is a character set, and UTF8 is an encoding. I
think Tatsuo is right here.
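
The distinction can be shown with a short Python sketch (the helper name `encode_all` and the sample character are chosen only for illustration): one Unicode code point, which belongs to the character set, maps to different byte sequences depending on the encoding.

```python
def encode_all(ch: str) -> dict:
    """Encode a single character under several encodings."""
    return {name: ch.encode(name) for name in ("utf-8", "utf-16-be", "iso-8859-1")}


ch = "\u00e9"          # LATIN SMALL LETTER E WITH ACUTE
code_point = ord(ch)   # its position in the Unicode character set: 0xE9
encoded = encode_all(ch)
# encoded["utf-8"]      == b"\xc3\xa9"  (two bytes)
# encoded["utf-16-be"]  == b"\x00\xe9"  (two bytes, different values)
# encoded["iso-8859-1"] == b"\xe9"      (one byte)
```

The character is the same in all three cases; only its byte representation changes, which is what a client/server encoding conversion alters.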

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +


#4 Tatsuo Ishii
ishii@postgresql.org
In reply to: Bruce Momjian (#3)
Re: Questionable description in datatype.sgml

> Uh, I think Unicode is a character set, and UTF8 is an encoding. I
> think Tatsuo is right here.

Yes, a character set is different from an encoding. I thought that was
common understanding.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


#5 Bruce Momjian
bruce@momjian.us
In reply to: Tatsuo Ishii (#4)
1 attachment(s)
Re: Questionable description in datatype.sgml

On Fri, Jun 24, 2016 at 07:27:24AM +0900, Tatsuo Ishii wrote:

> Yes, a character set is different from an encoding. I thought that was
> common understanding.

Fixed with the attached applied patch.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+                     Ancient Roman grave inscription +

Attachments:

encoding.diff (text/x-diff; charset=us-ascii)
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
new file mode 100644
index 11e246f..9643746
*** a/doc/src/sgml/datatype.sgml
--- b/doc/src/sgml/datatype.sgml
*************** SET xmloption TO { DOCUMENT | CONTENT };
*** 4219,4225 ****
  
     <para>
      When using binary mode to pass query parameters to the server
!     and query results back to the client, no character set conversion
      is performed, so the situation is different.  In this case, an
      encoding declaration in the XML data will be observed, and if it
      is absent, the data will be assumed to be in UTF-8 (as required by
--- 4219,4225 ----
  
     <para>
      When using binary mode to pass query parameters to the server
!     and query results back to the client, no encoding conversion
      is performed, so the situation is different.  In this case, an
      encoding declaration in the XML data will be observed, and if it
      is absent, the data will be assumed to be in UTF-8 (as required by