XMLSerialize: version and explicit XML declaration

Started by Jim Jones | over 1 year ago | 10 messages | hackers
#1 Jim Jones
jim.jones@uni-muenster.de

Hi,

I'm working on the flags VERSION (X076), INCLUDING XMLDECLARATION, and
EXCLUDING XMLDECLARATION (X078) for XMLSerialize, and I have a question
for SQL/XML experts on the list.

Is there any validation mechanism for VERSION <character string
literal>? The SQL/XML spec says

"The <character string literal> immediately contained in <XML serialize
version> shall be '1.0' or '1.1', or it shall identify some successor to
XML 1.0 and XML 1.1."

I was wondering if a validation here would make any sense, since
XMLSerialize is only supposed to print a string --- not to mention that
validating "some successor to XML 1.0 and XML 1.1" can be challenging :)
But again, printing an "invalid" XML string also doesn't seem very nice.
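To make the question concrete, here is a minimal sketch of one possible validation policy, written in Python only for brevity. It assumes the VersionNum production of XML 1.0 (Fifth Edition), '1.' followed by one or more digits, as a stand-in for "some successor to XML 1.0 and XML 1.1". The helper name classify_version and its three labels are hypothetical; neither the spec nor any patch in this thread defines them.

```python
import re

# VersionNum production from XML 1.0 (Fifth Edition): '1.' [0-9]+
# Assumption: this is used as a rough proxy for "some successor".
_VERSION_NUM = re.compile(r"1\.[0-9]+")


def classify_version(version: str) -> str:
    """Classify a VERSION literal (hypothetical helper, not a real API).

    Returns 'known' for the two versions the spec names explicitly,
    'successor' for other strings that match the 1.x grammar, and
    'invalid' for everything else.
    """
    if version in ("1.0", "1.1"):
        return "known"
    if _VERSION_NUM.fullmatch(version):
        return "successor"
    return "invalid"
```

Under this policy 'foo' and '2.0' would be rejected, while '1.2' would be let through as a possible successor. Whether a future XML 2.0 should count is exactly the part the spec leaves open.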

The Oracle implementation accepts pretty much anything:

SQL> SELECT xmlserialize(DOCUMENT xmltype('<foo><bar>42</bar></foo>')
VERSION 'foo') AS xml FROM dual;

XML
--------------------------------------------------------------------------------
<?xml version="foo"?>
<foo>
  <bar>42</bar>
</foo>

In DB2, anything other than '1.0' raises an error:

db2 => SELECT XMLSERIALIZE(CONTENT XMLELEMENT(NAME "db2",service_level)
AS varchar(100) VERSION '1.0' INCLUDING XMLDECLARATION) FROM
sysibmadm.env_inst_info;

1
----------------------------------------------------------------------------------------------------
<?xml version="1.0" encoding="UTF-8"?><db2>DB2 v11.5.9.0</db2>

  1 record(s) selected.

db2 => SELECT XMLSERIALIZE(CONTENT XMLELEMENT(NAME "db2",service_level)
AS varchar(100) VERSION '1.1' INCLUDING XMLDECLARATION) FROM
sysibmadm.env_inst_info;
SQL0171N  The statement was not processed because the data type, length or
value of the argument for the parameter in position "2" of routine
"XMLSERIALIZE" is incorrect. Parameter name: "".  SQLSTATE=42815

Any thoughts on how we should approach this feature?

Thanks!

Best, Jim

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#1)
Re: XMLSerialize: version and explicit XML declaration

Jim Jones <jim.jones@uni-muenster.de> writes:

> Is there any validation mechanism for VERSION <character string
> literal>?

AFAICS, all we do with an embedded XML version string is pass it to
libxml2's xmlNewDoc(), which is the authority on whether it means
anything. I'd be inclined to do the same here.

regards, tom lane

#3 Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#2)
Re: XMLSerialize: version and explicit XML declaration

Hi Tom

On 25.09.24 18:02, Tom Lane wrote:

> AFAICS, all we do with an embedded XML version string is pass it to
> libxml2's xmlNewDoc(), which is the authority on whether it means
> anything. I'd be inclined to do the same here.

Thanks. I used xml_is_document(), which calls xmlNewDoc(), to check if
the returned document is valid or not. It then decides if an unexpected
version deserves an error or just a warning.

Attached v1 with the first attempt to implement these features.

==== INCLUDING / EXCLUDING XMLDECLARATION (SQL/XML X078) ====

The flags INCLUDING XMLDECLARATION and EXCLUDING XMLDECLARATION
respectively add or remove the XML declaration in the XMLSerialize
output of the given DOCUMENT or CONTENT.

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text
    INCLUDING XMLDECLARATION);

                         xmlserialize
---------------------------------------------------------------
 <?xml version="1.0" encoding="UTF8"?><foo><bar>42</bar></foo>
(1 row)

SELECT
  xmlserialize(
    DOCUMENT '<?xml version="1.0"
encoding="UTF-8"?><foo><bar>42</bar></foo>'::xml AS text
    EXCLUDING XMLDECLARATION);

       xmlserialize
--------------------------
 <foo><bar>42</bar></foo>
(1 row)

If omitted, the output will contain an XML declaration only if the given
XML value had one.

SELECT
  xmlserialize(
    DOCUMENT '<?xml version="1.0"
encoding="UTF-8"?><foo><bar>42</bar></foo>'::xml AS text);

                          xmlserialize                          
----------------------------------------------------------------
 <?xml version="1.0" encoding="UTF-8"?><foo><bar>42</bar></foo>
(1 row)

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text);
       xmlserialize       
--------------------------
 <foo><bar>42</bar></foo>
(1 row)
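The three cases above (INCLUDING, EXCLUDING, option omitted) can be modelled with a small Python sketch. It is a string-level illustration of the output shown, not the C code from the patch. The function name serialize, the declaration argument, and the default encoding label "UTF8" are assumptions made for the example. The thread does not show what INCLUDING does to an input that already carries a declaration, so keeping the existing one is also an assumption.

```python
import re
from typing import Optional

# Matches an XML declaration at the very start of the text.
_XML_DECL = re.compile(r"\A<\?xml\s.*?\?>", re.DOTALL)


def serialize(xml_text: str, declaration: Optional[str] = None,
              encoding: str = "UTF8") -> str:
    """Model of the declaration handling (hypothetical helper).

    declaration is 'including', 'excluding', or None when the
    option is omitted.
    """
    match = _XML_DECL.match(xml_text)
    body = xml_text[match.end():] if match else xml_text

    if declaration is None:
        # Option omitted: output has a declaration only if the input had one.
        return xml_text
    if declaration == "excluding":
        return body
    if declaration == "including":
        if match:
            # Assumption: an existing declaration is kept unchanged.
            return xml_text
        return f'<?xml version="1.0" encoding="{encoding}"?>' + body
    raise ValueError(f"unknown declaration option: {declaration!r}")
```

For example, serialize('<foo><bar>42</bar></foo>', 'including') yields the same string as the first query above.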

==== VERSION (SQL/XML X076) ====

VERSION can be used to specify the version in the XML declaration of the
serialized DOCUMENT or CONTENT.

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '1.0'
    INCLUDING XMLDECLARATION);
    
                         xmlserialize                          
---------------------------------------------------------------
 <?xml version="1.0" encoding="UTF8"?><foo><bar>42</bar></foo>
(1 row)

For XML values of type DOCUMENT, the version is validated by libxml2's
xmlNewDoc(), which raises an error for invalid versions and a warning
for unsupported ones. For CONTENT values, no validation is performed.

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '1.1'
    INCLUDING XMLDECLARATION);
    
WARNING:  line 1: Unsupported version '1.1'
<?xml version="1.1" encoding="UTF8"?><foo><bar>42</bar></foo>
                   ^
                         xmlserialize
---------------------------------------------------------------
 <?xml version="1.1" encoding="UTF8"?><foo><bar>42</bar></foo>
(1 row)

SELECT
  xmlserialize(
    DOCUMENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '2.0'
    INCLUDING XMLDECLARATION);

ERROR:  Invalid XML declaration: VERSION '2.0'

SELECT
  xmlserialize(
    CONTENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '2.0'
    INCLUDING XMLDECLARATION);

                         xmlserialize
---------------------------------------------------------------
 <?xml version="2.0" encoding="UTF8"?><foo><bar>42</bar></foo>
(1 row)

This option is ignored if the XML value had no XML declaration and
INCLUDING XMLDECLARATION was not used.

SELECT
  xmlserialize(
    CONTENT '<foo><bar>42</bar></foo>'::xml AS text
    VERSION '1111');

       xmlserialize
--------------------------
 <foo><bar>42</bar></foo>
(1 row)
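Putting the VERSION examples together, the observable behaviour can be summarised in the following Python sketch. It only mirrors the outputs shown above and is not the patch's implementation. The name serialize_with_version and its parameters are hypothetical. The rule "versions starting with '1.' warn, anything else errors" is inferred from just two data points ('1.1' and '2.0') and should be read as an assumption. Rebuilding the whole declaration when the input already has one is likewise assumed.

```python
import re
import warnings

# Matches an XML declaration at the very start of the text.
_XML_DECL = re.compile(r"\A<\?xml\s.*?\?>", re.DOTALL)


def serialize_with_version(xml_text: str, version: str,
                           kind: str = "document",
                           including: bool = False,
                           encoding: str = "UTF8") -> str:
    """Model of VERSION handling (hypothetical helper)."""
    match = _XML_DECL.match(xml_text)

    # VERSION is ignored when there is no declaration to write it into.
    if match is None and not including:
        return xml_text

    # Only DOCUMENT values are validated; CONTENT is passed through.
    if kind == "document" and version != "1.0":
        if version.startswith("1."):
            # Assumption: inferred from the '1.1' example above.
            warnings.warn(f"Unsupported version '{version}'")
        else:
            raise ValueError(
                f"Invalid XML declaration: VERSION '{version}'")

    body = xml_text[match.end():] if match else xml_text
    # Assumption: the declaration is rebuilt from scratch.
    return f'<?xml version="{version}" encoding="{encoding}"?>' + body
```

With kind="content" the same call accepts '2.0' silently, which matches the CONTENT example above.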

Best, Jim

Attachments:

v1-0001-Add-XMLSerialize-version-and-explicit-XML-declara.patch (text/x-patch; charset=UTF-8), +1347 -51

#4 Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#3)
Re: XMLSerialize: version and explicit XML declaration

On 30.09.24 10:08, Jim Jones wrote:

> On 25.09.24 18:02, Tom Lane wrote:
>> AFAICS, all we do with an embedded XML version string is pass it to
>> libxml2's xmlNewDoc(), which is the authority on whether it means
>> anything. I'd be inclined to do the same here.
>
> Thanks. I used xml_is_document(), which calls xmlNewDoc(), to check if
> the returned document is valid or not. It then decides if an unexpected
> version deserves an error or just a warning.
>
> Attached v1 with the first attempt to implement these features.

rebase

Best regards, Jim

Attachments:

v2-0001-Add-XMLSerialize-explicit-XML-declaration-SQL-XML.patch (text/x-patch; charset=UTF-8), +738 -33

#5 Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#4)
Re: XMLSerialize: version and explicit XML declaration

rebase

Best, Jim

Attachments:

v3-0001-Add-XMLSerialize-version-and-explicit-XML-declara.patch (text/x-patch; charset=UTF-8), +1476 -56

#6 Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#4)
Re: XMLSerialize: version and explicit XML declaration

Rebased, and added a missing check for the xmlBufferAddHead() result.

--
Jim

Attachments:

v4-0001-Add-XMLSerialize-version-and-explicit-XML-declara.patch (text/x-patch; charset=UTF-8), +1487 -56

#7 Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#6)
Re: XMLSerialize: version and explicit XML declaration

rebase

--
Jim

Attachments:

v5-0001-Add-XMLSerialize-version-and-explicit-XML-declara.patch (text/x-patch; charset=UTF-8), +1490 -56

#8 Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#7)
Re: XMLSerialize: version and explicit XML declaration

rebased

--
Jim

Attachments:

v6-0001-Add-XMLSerialize-version-and-explicit-XML-declara.patch (text/x-patch; charset=UTF-8), +1490 -56

#9 Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#8)
Re: XMLSerialize: version and explicit XML declaration

rebased

Jim

Attachments:

v7-0001-Add-XMLSerialize-version-and-explicit-XML-declara.patch (text/x-patch; charset=UTF-8), +1490 -56

#10 Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#9)
Re: XMLSerialize: version and explicit XML declaration

rebase

Jim

Attachments:

v8-0001-Add-XMLSerialize-version-and-explicit-XML-declara.patch (text/x-patch; charset=UTF-8), +1559 -56