Add XMLNamespaces to XMLElement

Started by Jim Jones over 1 year ago; 24 messages; hackers
#1 Jim Jones
jim.jones@uni-muenster.de

Hi,

I'd like to propose the implementation of the XMLNamespaces option for
XMLElement.

XMLNAMESPACES(nsuri AS nsprefix)
XMLNAMESPACES(DEFAULT default-nsuri)
XMLNAMESPACES(NO DEFAULT)

* nsprefix:              Namespace's prefix.
* nsuri:                 Namespace's URI.
* DEFAULT default-nsuri: Specifies the DEFAULT namespace to use within
the scope of a namespace declaration.
* NO DEFAULT:            Specifies that NO DEFAULT namespace is to be
used within the scope of a namespace declaration.

This basically works pretty much like XMLAttributes, but with a few more
restrictions (see SQL/XML:2023, 11.2 <XML lexically scoped options>):

* XML namespace declaration shall contain at most one DEFAULT namespace
declaration item.
* No namespace prefix shall be equivalent to xml or xmlns.
* No namespace URI shall be identical to http://www.w3.org/2000/xmlns/
or to http://www.w3.org/XML/1998/namespace.
* The value of a namespace URI contained in a regular namespace
declaration item (no DEFAULT) shall not be a zero-length string.
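
As an illustration only, the four restrictions above could be checked along these lines. This is a minimal C sketch, not code from the patch: the NsDecl struct and check_ns_decls() are hypothetical names, and it assumes that DEFAULT and NO DEFAULT both count as default namespace declaration items and that prefixes are compared case-sensitively.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical representation of one XMLNAMESPACES item (not from the patch). */
typedef struct NsDecl
{
    const char *prefix;     /* NULL for DEFAULT and NO DEFAULT items */
    const char *uri;        /* NULL for NO DEFAULT */
} NsDecl;

/* Returns NULL if the list satisfies the four restrictions, else a message. */
const char *
check_ns_decls(const NsDecl *decls, size_t n)
{
    size_t      ndefault = 0;

    for (size_t i = 0; i < n; i++)
    {
        const NsDecl *d = &decls[i];

        if (d->prefix == NULL)
        {
            /* Assumption: DEFAULT and NO DEFAULT share the "at most one" limit. */
            if (++ndefault > 1)
                return "more than one DEFAULT namespace declaration item";
        }
        else
        {
            if (strcmp(d->prefix, "xml") == 0 ||
                strcmp(d->prefix, "xmlns") == 0)
                return "namespace prefix must not be xml or xmlns";
            /* Only regular (prefixed) items must have a non-empty URI. */
            if (d->uri == NULL || d->uri[0] == '\0')
                return "regular namespace URI must not be zero-length";
        }

        /* The two reserved URIs are rejected for every kind of item. */
        if (d->uri != NULL &&
            (strcmp(d->uri, "http://www.w3.org/2000/xmlns/") == 0 ||
             strcmp(d->uri, "http://www.w3.org/XML/1998/namespace") == 0))
            return "reserved namespace URI";
    }
    return NULL;
}
```

For example, a list holding both a DEFAULT item and a NO DEFAULT item would be rejected by the first check, while a single ('http://x.y' AS bar) item passes and yields NULL.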

Examples:

SELECT xmlelement(NAME "foo", xmlnamespaces('http://x.y' AS bar));
          xmlelement           
-------------------------------
 <foo xmlns:bar="http://x.y"/>

SELECT xmlelement(NAME "foo", xmlnamespaces(DEFAULT 'http://x.y'));
        xmlelement         
---------------------------
 <foo xmlns="http://x.y"/>

SELECT xmlelement(NAME "foo", xmlnamespaces(NO DEFAULT));
   xmlelement    
-----------------
 <foo xmlns=""/>

In transformXmlExpr() it seemed convenient to use the same parameters to
store the prefixes and URIs as in XMLAttributes (arg_names and
named_args), but I am still not so sure it is the right approach. Is
there perhaps a better way?

Any thoughts? Feedback welcome!

Best, Jim

Attachments:

v1-0001-Add-XMLNamespaces-option-to-XMLElement.patch (text/x-patch; charset=UTF-8; +884 -23)

#2 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Jones (#1)
Re: Add XMLNamespaces to XMLElement

Hi

On Sat, 21 Dec 2024 at 0:51, Jim Jones <jim.jones@uni-muenster.de> wrote:

In transformXmlExpr() it seemed convenient to use the same parameters to
store the prefixes and URIs as in XMLAttributes (arg_names and
named_args), but I am still not so sure it is the right approach. Is
there perhaps a better way?

Any thoughts? Feedback welcome!

+1

Pavel

#3 Umar Hayat
postgresql.wizard@gmail.com
In reply to: Pavel Stehule (#2)
Re: Add XMLNamespaces to XMLElement

Hi,
+1 for the enhancement.

I haven't compiled and reviewed the full patch yet, please see a few
comments from my side based on static analysis.

1. Though the changes target XMLNAMESPACES for XMLElement, in
my opinion they will affect XMLTABLE as well, because the
'xml_namespace_list' rule is now shared.
Adding 'NO DEFAULT' to xml_namespace_list will allow users to use it
with XMLTABLE XMLNAMESPACES as well. The PostgreSQL grammar allows
specifying DEFAULT in NAMESPACE, but this results in the following error:
"ERROR: DEFAULT namespace is not supported"
What should the behavior be for XMLTABLE with this change? Should it be
allowed and the error message updated (maybe this will
not be an error at all), or do we need to restrict users from using 'NO
DEFAULT' with XMLTable?

2. Should we reuse the 'xml_namespaces' rule for XMLTable, as the
definition is the same?

3. In this patch, 'NO DEFAULT' behaves like DEFAULT '<blank>'
(empty URI). Shouldn't it be more like 'DEFAULT NULL', resulting in
the following?
SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
xmlelement
------------------
<root/>

instead of

SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
xmlelement
------------------
<root xmlns=""/>

Regards
Umar Hayat

--
Umar Hayat
Bitnine (https://bitnine.net/)

#4 Jim Jones
jim.jones@uni-muenster.de
In reply to: Umar Hayat (#3)
Re: Add XMLNamespaces to XMLElement

Hi Umar, hi Pavel,

Thanks for taking a look at this patch!

On 26.12.24 05:15, Umar Hayat wrote:

Hi,
+1 for the enhancement.

I haven't compiled and reviewed the full patch yet, please see a few
comments from my side based on static analysis.

1. Though changes are targeted for XMLNAMESPACES for XMLElement but in
my opinion it will affect XMLTABLE as well because the
'xml_namespace_list' rule is shared now.
Adding 'NO DEFAULT' in xml_namespace_list will allow users to use it
with XMLTABLE XMLNAMESPACES as well.PostgreSQL grammar allow to
specify DEFAULT in NAMESPACE but resulting in following error:
"ERROR: DEFAULT namespace is not supported"

I also considered creating a new rule to avoid any conflict with
XMLTable, but as it didn't break any regression test and the result
would be pretty much the same as with "DEFAULT 'str'", I thought that
extending the existing rule would be the way to go.

SELECT * FROM XMLTABLE(XMLNAMESPACES(NO DEFAULT),
                      '/rows/row'
                      PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                      COLUMNS a int PATH 'a');
ERROR:  DEFAULT namespace is not supported

What do you think?

What would be behavior with this change for XMLTABLE, should this be
allowed and the error messages need to be updated (may be this will
not be an error at all) or we need to restrict users to not use 'NO
DEFAULT' with XMLTable.

Perhaps updating the error message would suffice?

2. Should we reuse the 'xml_namespaces' rule for XMLTable, as the
definition is the same.

That would be good. I'm just afraid it would deviate a bit from the
scope of this patch, since it means touching another function. Would you
suggest adding it to a patch series?

3. In this patch 'NO DEFAULT' behavior is like DEFAULT '<blank>'
(empty uri) , should not it be more like 'DEFAULT NULL' to result in
the following ?
SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
xmlelement
------------------
<root/>

instead of

SELECT xmlelement(NAME "root", xmlnamespaces(NO DEFAULT));
xmlelement
------------------
<root xmlns=""/>

The idea of NO DEFAULT is pretty much to free an element (and its
children) from a previous DEFAULT in the same scope.

SELECT
  xmlserialize(DOCUMENT
    xmlelement(NAME "root",
      xmlnamespaces(DEFAULT 'http:/x.y/ns1'),
      xmlelement(NAME "foo",
        xmlnamespaces(NO DEFAULT))
  ) AS text INDENT);

         xmlserialize         
------------------------------
 <root xmlns="http:/x.y/ns1">+
   <foo xmlns=""/>           +
 </root>
(1 row)
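
To make the scoping rule concrete, here is a small C sketch (a hypothetical helper, not part of the patch) that resolves the default namespace in effect for an element, given the default namespace declarations of the element and its ancestors. It assumes an empty string stands for NO DEFAULT, i.e. xmlns="".

```c
#include <stddef.h>

/*
 * Hypothetical helper: decls[0] belongs to the root element and
 * decls[depth - 1] to the element of interest. Each entry is that
 * element's own default namespace declaration:
 * NULL = none declared, "" = NO DEFAULT (xmlns=""), otherwise a URI.
 */
const char *
effective_default_ns(const char *const *decls, size_t depth)
{
    /* Walk from the element up to the root; the nearest declaration wins. */
    for (size_t i = depth; i > 0; i--)
    {
        const char *d = decls[i - 1];

        if (d == NULL)
            continue;           /* nothing declared here, keep looking */
        if (d[0] == '\0')
            return NULL;        /* xmlns="" undeclares the default */
        return d;
    }
    return NULL;                /* no default namespace in scope */
}
```

With the example above, <root> resolves to http:/x.y/ns1 and <foo> resolves to no default namespace. For a lone root element, NO DEFAULT and no declaration at all resolve to the same thing under this sketch.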

I believe this behaviour might be confusing if NO DEFAULT is used in the
root element, as there is no previously declared namespace. Perhaps
making NO DEFAULT behave like DEFAULT NULL only in the root element
would make things clearer? The SQL/XML spec doesn't say anything
specific about it, but DB2 had the same thought[1]. For reference, here
are the regression tests[2] of this patch tested with the DB2 implementation.

On Sat, 21 Dec 2024 at 14:57, Pavel Stehule <pavel.stehule@gmail.com> wrote:

+1

Pavel

rebase in v2 attached - due to changes in gram.y

Thanks a lot

Best, Jim

1 - https://dbfiddle.uk/0QsWlfZR
2 - https://dbfiddle.uk/SyiDfXod

Attachments:

v2-0001-Add-XMLNamespaces-option-to-XMLElement.patch (text/x-patch; charset=UTF-8; +884 -23)

#5 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Jones (#4)
Re: Add XMLNamespaces to XMLElement

On Thu, 26 Dec 2024 at 14:46, Jim Jones <jim.jones@uni-muenster.de> wrote:

I believe this behaviour might be confusing if NO DEFAULT is used in the
root element, as it there is no previously declared namespace. Perhaps
making NO DEFAULT behave like DEFAULT NULL only in the root element
would make things clearer? The SQL/XML spec doesn't say anything
specific about it, but DB2 had the same thought[1]. For reference, here
are the regress tests[2] of this patch tested with the DB2 implementation.

You can check Oracle too.

rebase in v2 attached - due to changes in gram.y

I checked this patch

The parser part looks a little bit dirty: it multiplies the number of
XMLELEMENT rules. Maybe xmlattributes and xml_namespaces could be processed
elsewhere, as a list of xml_element_options?

Regards

Pavel

#6 Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#4)
Re: Add XMLNamespaces to XMLElement

Hi Umar, Hi Pavel,

On 26.12.24 14:46, Jim Jones wrote:

The idea of NO DEFAULT is pretty much to free an element (and its
children) from a previous DEFAULT in the same scope.

SELECT
  xmlserialize(DOCUMENT
    xmlelement(NAME "root",
      xmlnamespaces(DEFAULT 'http:/x.y/ns1'),
      xmlelement(NAME "foo",
        xmlnamespaces(NO DEFAULT))
  ) AS text INDENT);

         xmlserialize         
------------------------------
 <root xmlns="http:/x.y/ns1">+
   <foo xmlns=""/>           +
 </root>
(1 row)

v3 is attached, now using xmlTextWriterWriteAttributeNS from libxml2 for
managing XML namespaces, instead of using xmlTextWriterWriteAttribute.
Libxml2 is quite lenient, allowing the duplication of default namespaces
within the same scope and even permitting NO DEFAULT namespaces when no
previous DEFAULT declaration has been made - both are semantically valid.

The crux now is finding the appropriate balance between accuracy and
user intent. In the context of PostgreSQL's xmlelement and
xmlnamespaces, I would argue that explicitly declared namespaces,
redundant or not, ought to be preserved. A user who intentionally
repeats a namespace declaration might have sound reasons for doing so,
like ensuring clarity, preserving compatibility with external XML
processors, or sticking to a specific schema. Silently omitting these
declarations could lead to confusion.

Pavel has tidied up the parser modifications - it's looking much neater
now. Many thanks for that!

Best, Jim

Attachments:

v3-0001-Add-XMLNamespaces-option-to-XMLElement.patch (text/x-patch; charset=UTF-8; +1032 -23)

#7 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jim Jones (#6)
Re: Add XMLNamespaces to XMLElement

Hi

On Wed, 15 Jan 2025 at 21:35, Jim Jones <jim.jones@uni-muenster.de> wrote:

The crux now is finding the appropriate balance between accuracy and
user intent. In the context of PostgreSQL's xmlelement and
xmlnamespaces, I would argue that explicitly declared namespaces,
redundant or not, ought to be preserved. A user who intentionally
repeats a namespace declaration might have sound reasons for doing so,
like ensuring clarity, preserving compatibility with external XML
processors, or sticking to a specific schema. Silently omitting these
declarations could lead to confusion.

I now have no objections to the code.

The patch has docs and enough regression tests.
Patching and compilation went without problems.
make check-world passed.

I'll mark this patch as ready for committer

Pavel has tidied up the parser modifications - it's looking much neater
now. Many thanks for that!

It was a pleasure

Regards

Pavel

#8 Jim Jones
jim.jones@uni-muenster.de
In reply to: Pavel Stehule (#7)
Re: Add XMLNamespaces to XMLElement

On 16.01.25 08:36, Pavel Stehule wrote:

Now, I have not any objections against the code

The patch has doc and enough regress tests
The patching and compilation without problems
make check-world passed

I'll mark this patch as ready for committer

rebase due to gram.y changes introduced in 80feb72

--
Jim

Attachments:

v4-0001-Add-XMLNamespaces-option-to-XMLElement.patch (text/x-patch; charset=UTF-8; +1032 -23)

#9 Umar Hayat
postgresql.wizard@gmail.com
In reply to: Jim Jones (#8)
Re: Add XMLNamespaces to XMLElement

Hi Jim & Pavel,
Sorry for getting back a bit late on this. A few more cases you might
need to consider:

As I mentioned in my first static review, existing XMLTable
behaviour might change. I gave it a runtime review, and here are a few
findings:

1. Because of this patch, the XMLTable namespace clause will accept NO DEFAULT.
I was expecting an error message based on prior behavior '', but
if I run the following query it shows something different:
"
SELECT * FROM XMLTABLE(XMLNAMESPACES(NO DEFAULT),
                      '/rows/row'
                      PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                      COLUMNS a int PATH 'a');
ERROR: cache lookup failed for type 0
"
Can you please re-check this behaviour (there might be something wrong
with my build environment)

2. It seems a ColumnRef was already allowed as a namespace (I doubt
that's a correct implementation; other DBs allow only strings,
AFAIK). For non-default namespaces maybe it's fair, but should this be
allowed for the default namespace? Any opinions?
create table tbl1 ( col1 text);
insert into tbl1 values ('abc');
insert into tbl1 values ('def');
select xmlelement(NAME "root", xmlnamespaces(default col1)) from tbl1;
xmlelement
---------------------
<root xmlns="abc"/>
<root xmlns="def"/>
(2 rows)

Changing status in commitfest, feel free to revert back.

Regards
Umar Hayat

--
Umar Hayat
Bitnine (https://bitnine.net/)

#10 Jim Jones
jim.jones@uni-muenster.de
In reply to: Umar Hayat (#9)
Re: Add XMLNamespaces to XMLElement

Hi Umar

On 20.01.25 17:19, Umar Hayat wrote:

Hi Jim & Pavel,
Sorry for getting back a bit late on this. Few more case you might
need consider:

As I mentioned in my first static review about XMLTable existing
behaviour might change, I give it a run time review and here are few
findings:

1. Because of this patch XMLTable namespace will accept NO DEFAULT
value, I was expecting an error message based on prior behavior '' but
If I run following query it show something different:
"
SELECT * FROM XMLTABLE(XMLNAMESPACES(NO DEFAULT),
                      '/rows/row'
                      PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                      COLUMNS a int PATH 'a');
ERROR: cache lookup failed for type 0
"
Can you please re-check this behaviour (there might be something wrong
with my build environment)

There is nothing wrong with your environment :) I forgot to test
XMLTable for NO DEFAULT namespaces. To be consistent with the existing
code, I added the following condition to transformRangeTableFunc:

if (!r->name && !r->val)
    ns_uri = transformExpr(pstate, makeStringConst("", r->location),
                           EXPR_KIND_FROM_FUNCTION);
else
    ns_uri = transformExpr(pstate, r->val, EXPR_KIND_FROM_FUNCTION);

It adds an empty string to the uri, which is the value expected for NO
DEFAULT.

I also added a NO DEFAULT test for XMLTable in xml.sql

v5 attached.

2. Seems like namespace as ColumnRef was already allowed (I doubt if
that's a correct implementation, I see other DB allows only strings
AFAIK), for non-default namespaces may be its fair, should this be
allowed for default namespace, any opinion.
create table tbl1 ( col1 text);
insert into tbl1 values ('abc');
insert into tbl1 values ('def');
select xmlelement(NAME "root", xmlnamespaces(default col1)) from tbl1;
xmlelement
---------------------
<root xmlns="abc"/>
<root xmlns="def"/>
(2 rows)

What are your concerns about supporting ColumnRef for the URIs?

It is currently supported by XMLAttributes:

CREATE TABLE t AS SELECT 'http://x.y' AS uri;
SELECT xmlelement(NAME el, xmlattributes("uri" AS att)) FROM t;

       xmlelement       
------------------------
 <el att="http://x.y"/>
(1 row)

Many thanks for the review!

Best, Jim

--
Jim

Attachments:

v5-0001-Add-XMLNamespaces-option-to-XMLElement.patch (text/x-patch; charset=UTF-8; +1066 -25)

#11 Umar Hayat
postgresql.wizard@gmail.com
In reply to: Jim Jones (#10)
Re: Add XMLNamespaces to XMLElement

On Tue, 21 Jan 2025 at 04:04, Jim Jones <jim.jones@uni-muenster.de> wrote:

What are your concerns about supporting ColumnRef for the URI's?

It is currently supported by XMLAttributes:

For XMLAttributes, the attribute value should accept ColumnRef/Expr because
that's the data/content we want to generate. But namespaces and XML
tags, IMO, should be considered part of the structure/schema of the
XML. Allowing namespaces (default or otherwise) to be generated
arbitrarily for each record does not seem correct to me; it's like
generating arbitrary XML by printing strings, which does not require XML
functions.

- DB2 allows namespaces in XMLElement, but it does not allow Expr/ColumnRef.
- Oracle allows namespaces only in XMLTable, and it does not allow Expr/ColumnRef.

- Accepting only SConst/String or numeric would confine the error handling to the
parsing stage, which can validate the schema once instead of evaluating an
expression per record. Per-record evaluation leads to the following problem at runtime:

CREATE TABLE xmltab (uri TEXT);
INSERT INTO xmltab VALUES ('good'), ('');
SELECT XMLElement(NAME "root", XMLNamespaces(uri AS zz)) from xmltab;
ERROR: invalid XML namespace URI for "zz"
DETAIL: a regular XML namespace cannot be a zero-length string

Imagine there are millions of records and it fails in the middle.

- Also, evaluating the expression and validating it (valid
URI etc.) per record will, I believe, have some performance impact (I haven't
tested it, so I can't say for sure how much).

- It's possible that there is some other angle that I am missing
right now; maybe Pavel/Álvaro can shed light on this. I briefly went
through the first implementation thread. There was a bit of discussion
about using b_expr instead of c_expr, but I have yet to explore the
reasoning for not using SConst.

--
Umar Hayat
Bitnine (https://bitnine.net/)

#12 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Umar Hayat (#11)
Re: Add XMLNamespaces to XMLElement

Hi

On Tue, 21 Jan 2025 at 11:45, Umar Hayat <postgresql.wizard@gmail.com> wrote:

- It's possible that there might be some other angle that I am missing
right now, maybe Pavel/Àlvaro can shed light on this. I briefly went
through the first implementation thread. There was a bit of discussion
to use b_expr instead of c_expr, but I have yet to explore the
reasoning of not using SConst.

Postgres usually doesn't require string constants as arguments of functions or
pseudo-functions.

I remember a discussion related to string_agg, where the field separator is
not required to be a constant either, although it makes sense there.
I think there are more cases, probably aggregate (maybe window) functions,
where Postgres allows any expression and other databases allow just
constants.

Personally, I find that allowing only string constants can sometimes be too messy (when I
use other DBs).

Regards

Pavel

#13 Jim Jones
jim.jones@uni-muenster.de
In reply to: Umar Hayat (#11)
Re: Add XMLNamespaces to XMLElement

On 21.01.25 11:48, Umar Hayat wrote:

For XMLAttributes, the attribute value should accept ColumnRef/Expr
because that's the data/content we want to generate. But namespaces and
XML tags, IMO, should be considered part of the structure/schema of the
XML. Allowing namespaces (default or otherwise) to be generated
arbitrarily for each record does not seem correct to me; it's like
generating arbitrary XML by printing strings, which does not require XML
functions.

I'm not sure I completely get your 'print string' argument. Namespaces
are added to the element using libxml2's xmlTextWriterWriteAttributeNS
function, so we’re letting libxml2 handle whether a namespace
declaration is valid or not. Of course, there are still some extra
checks required by the SQL/XML standard.

- DB2 allows namespaces in XMLElement, but it does not allow Expr/ColumnRef.
- Oracle allows namespaces only in XMLTable, and it does not allow Expr/ColumnRef.

- Restricting this to SConst/String or numeric would confine error
handling to the parsing stage, which can validate the schema once
instead of evaluating an expression per record. The latter leads to
the following problem at runtime:

CREATE TABLE xmltab (uri TEXT);
INSERT INTO xmltab VALUES ('good'), ('');
SELECT XMLElement(NAME "root", XMLNamespaces(uri AS zz)) from xmltab;
ERROR: invalid XML namespace URI for "zz"
DETAIL: a regular XML namespace cannot be a zero-length string

Imagine there are millions of records and it fails in the middle.

I don't think discarding a feature just because the input data might
raise an exception in a long transaction is a strong argument here. For
your specific case, the user can always use a WHERE clause to filter out
any URIs that aren’t valid for XMLNamespace. Additionally, the
documentation already mentions this specific limitation:

"The value of a <replaceable>regular-nsuri</replaceable> cannot be a
zero-length string."

So it shouldn’t really catch anyone off guard :)
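
A minimal sketch of the WHERE-clause workaround mentioned above. It assumes
the proposed patch is applied (XMLNamespaces inside XMLElement is not
available otherwise) and reuses the xmltab table from the quoted example.

```sql
-- Assumption: the XMLNAMESPACES patch from this thread is installed.
-- Rows with a zero-length (or NULL) uri are excluded by the filter,
-- so the zero-length check is not reached for them.
SELECT XMLElement(NAME "root", XMLNamespaces(uri AS zz))
FROM xmltab
WHERE uri <> '';
```

With the two rows inserted in the example, only the 'good' row would remain
after the filter. Any other per-row validation the patch performs on the
URI still applies to the rows that remain.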

Thanks for the review!

Best, Jim

#14Umar Hayat
postgresql.wizard@gmail.com
In reply to: Jim Jones (#13)
Re: Add XMLNamespaces to XMLElement

Pavel and Jim,
If that's the case, it looks good to me.
Just wanted to highlight potential issues and implementation
differences compared to other databases.

Regards
Umar Hayat

#15Pavel Stehule
pavel.stehule@gmail.com
In reply to: Umar Hayat (#14)
Re: Add XMLNamespaces to XMLElement

On Mon, 27 Jan 2025 at 14:57, Umar Hayat <postgresql.wizard@gmail.com>
wrote:

Pavel and Jim,
If that's the case, it looks good to me.
Just wanted to highlight potential issues and implementation
differences compared to other databases.

It is correct.

There will always be some differences: DB2 has an entirely different
stack for XML processing, and Oracle has a hand-written parser that
allows some syntax Postgres does not, but in some cases it is
unfriendly, strict and restrictive.

And we are searching for a good compromise between internal consistency,
consistency with the standard, good usability and good portability.

Regards

Pavel

Regards
Umar Hayat

#16Jim Jones
jim.jones@uni-muenster.de
In reply to: Pavel Stehule (#15)
Re: Add XMLNamespaces to XMLElement

I've attached v6, which addresses two issues:

* Fixes a bug where XMLNAMESPACES declarations within a view were being
serialized as XMLATTRIBUTES.
* Prevents the makeString function from being called with a NULL
parameter - discussed in this thread [1].
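
One way to check the view fix by hand is sketched below. It assumes v6 (or
later) of the patch is applied, and the view name ns_view is made up for
this illustration.

```sql
-- Assumption: the XMLNAMESPACES patch (v6 or later) is installed.
-- ns_view is a hypothetical name used only for this check.
CREATE VIEW ns_view AS
  SELECT xmlelement(NAME "foo", xmlnamespaces('http://x.y' AS bar));

-- Deparse the stored definition; with the fix, the output should show
-- the declaration as XMLNAMESPACES rather than XMLATTRIBUTES.
SELECT pg_get_viewdef('ns_view'::regclass, true);

DROP VIEW ns_view;
```

Selecting from the view after a dump and restore would be a further check
that the deparsed text still parses back to the same expression.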

Best regards, Jim

[1]: /messages/by-id/CAFj8pRC24FEBNfTUrDgAK8f2nqDVvzWCuq=R=T19nUjL9GuLBA@mail.gmail.com

Attachments:

v6-0001-Add-XMLNamespaces-option-to-XMLElement.patchtext/x-patch; charset=UTF-8; name=v6-0001-Add-XMLNamespaces-option-to-XMLElement.patchDownload+1200-35
#17newtglobal postgresql_contributors
postgresql_contributors@newtglobalcorp.com
In reply to: Jim Jones (#16)
Re: Add XMLNamespaces to XMLElement

The following review has been posted through the commitfest application:
make installcheck-world: not tested
Implements feature: not tested
Spec compliant: not tested
Documentation: not tested

Hi Pavel,

I have tested this patch, and it proves to be highly useful when handling XMLNAMESPACES() with both DEFAULT and NO DEFAULT options. The following test cases confirm its correctness:

SELECT xmlelement(
  NAME "foo",
  XMLNAMESPACES('http://x.y' AS xy, 'http://a.b' AS ab, DEFAULT 'http://d.e'),
  xmlelement(NAME "foot",
    xmlelement(NAME "xy:shoe"),
    xmlelement(NAME "ab:lace")
  )
);

SELECT xmlelement(
  NAME "foo",
  XMLNAMESPACES('http://x.y' AS xy, 'http://a.b' AS ab, NO DEFAULT),
  xmlelement(NAME "foot",
    xmlelement(NAME "xy:shoe"),
    xmlelement(NAME "ab:lace")
  )
);

Additionally, I verified that the patch correctly supports multiple namespaces when using both DEFAULT and NO DEFAULT, ensuring expected behavior across different use cases.

Great work on this improvement!

#18Jim Jones
jim.jones@uni-muenster.de
In reply to: newtglobal postgresql_contributors (#17)
Re: Add XMLNamespaces to XMLElement

rebase

--
Jim

Attachments:

v7-0001-Add-XMLNamespaces-option-to-XMLElement.patchtext/x-patch; charset=UTF-8; name=v7-0001-Add-XMLNamespaces-option-to-XMLElement.patchDownload+1202-35
#19Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#18)
Re: Add XMLNamespaces to XMLElement

rebase

--
Jim

Attachments:

v8-0001-Add-XMLNamespaces-option-to-XMLElement.patchtext/x-patch; charset=UTF-8; name=v8-0001-Add-XMLNamespaces-option-to-XMLElement.patchDownload+1202-35
#20Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#19)
Re: Add XMLNamespaces to XMLElement

rebased due to changes in 2e94721

--
Jim

Attachments:

v9-0001-Add-XMLNamespaces-option-to-XMLElement.patchtext/x-patch; charset=UTF-8; name=v9-0001-Add-XMLNamespaces-option-to-XMLElement.patchDownload+1205-40
#21Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#20)
#22Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#21)
#23Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#22)
#24Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#23)