Bug in XPATH() produces invalid XML values and probably un-restorable dumps

Started by Florian Pflugover 14 years ago2 messages
#1Florian Pflug
fgp@phlo.org
1 attachment(s)

Hi

While trying to figure out sensible semantics for XPATH() and scalar-value returning XPath expressions, I've stumbled upon a bug in XPATH() that allows invalid XML values to be produced. This is a serious problem because should such invalid values get inserted into an XML column, an un-restorable dump ensues.

Here's an example (REL9_0_STABLE as of a few days ago)

template1=# SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1];
xpath
-------
<

Since XPATH() returns XML[], this value has type XML, but clearly isn't well-formed. And behold, casting to TEXT and back to XML complains loudly.

template1=# SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1]::TEXT::XML;
ERROR: invalid XML content
DETAIL: Entity: line 1: parser error : StartTag: invalid element name
<
^

The culprit is xml_xmlnodetoxmltype() in backend/utils/adt/xml.c. For non-element nodes, it returns the result of xmlXPathCastNodeToString() verbatim, even though that function doesn't reverse the entity replacement that was done during parsing. Adding a call to escape_xml()
for non-element nodes fixes the problem

template1=# SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1];
xpath
-------
&lt;

Patch is attached.

best regards,
Florian Pflug

Attachments:

pg_xpath_invalidxml.v1.patchapplication/octet-stream; name=pg_xpath_invalidxml.v1.patchDownload
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index eaf5b4d..359cded 100644
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
*************** xml_xmlnodetoxmltype(xmlNodePtr cur)
*** 3278,3284 ****
  		str = xmlXPathCastNodeToString(cur);
  		PG_TRY();
  		{
! 			result = (xmltype *) cstring_to_text((char *) str);
  		}
  		PG_CATCH();
  		{
--- 3278,3284 ----
  		str = xmlXPathCastNodeToString(cur);
  		PG_TRY();
  		{
! 			result = (xmltype *) cstring_to_text(escape_xml((char *) str));
  		}
  		PG_CATCH();
  		{
#2Florian Pflug
fgp@phlo.org
In reply to: Florian Pflug (#1)
1 attachment(s)
Re: Bug in XPATH() produces invalid XML values and probably un-restorable dumps

On May31, 2011, at 20:50 , Florian Pflug wrote:

While trying to figure out sensible semantics for XPATH() and scalar-value returning XPath expressions, I've stumbled upon a bug in XPATH() that allows invalid XML values to be produced. This is a serious problem because should such invalid values get inserted into an XML column, an un-restorable dump ensues.

Here's an example (REL9_0_STABLE as of a few days ago)

template1=# SELECT (XPATH('/*/text()', '<root>&lt;</root>'))[1];
xpath
-------
<

...

Patch is attached.

I've added two tests to the xml regression test which highlight the issue.

Updated patch attached.

best regards,
Florian Pflug

Attachments:

pg_xpath_invalidxml.v2.patchapplication/octet-stream; name=pg_xpath_invalidxml.v2.patchDownload
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 702b9e3..fa584de 100644
*** a/src/backend/utils/adt/xml.c
--- b/src/backend/utils/adt/xml.c
*************** xml_xmlnodetoxmltype(xmlNodePtr cur)
*** 3280,3286 ****
  		str = xmlXPathCastNodeToString(cur);
  		PG_TRY();
  		{
! 			result = (xmltype *) cstring_to_text((char *) str);
  		}
  		PG_CATCH();
  		{
--- 3280,3286 ----
  		str = xmlXPathCastNodeToString(cur);
  		PG_TRY();
  		{
! 			result = (xmltype *) cstring_to_text(escape_xml((char *) str));
  		}
  		PG_CATCH();
  		{
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index eaa5a74..822f3b4 100644
*** a/src/test/regress/expected/xml.out
--- b/src/test/regress/expected/xml.out
*************** SELECT xpath('//b', '<a>one <b>two</b> t
*** 502,507 ****
--- 502,519 ----
   {<b>two</b>,<b>etc</b>}
  (1 row)
  
+ SELECT xpath('//text()', '<root>&lt;</root>');
+  xpath  
+ --------
+  {&lt;}
+ (1 row)
+ 
+ SELECT xpath('//@value', '<root value="&lt;"/>');
+  xpath  
+ --------
+  {&lt;}
+ (1 row)
+ 
  -- Test xmlexists and xpath_exists
  SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>');
   xmlexists 
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index 711b435..d53131c 100644
*** a/src/test/regress/expected/xml_1.out
--- b/src/test/regress/expected/xml_1.out
*************** LINE 1: SELECT xpath('//b', '<a>one <b>t
*** 456,461 ****
--- 456,473 ----
                              ^
  DETAIL:  This functionality requires the server to be built with libxml support.
  HINT:  You need to rebuild PostgreSQL using --with-libxml.
+ SELECT xpath('//text()', '<root>&lt;</root>');
+ ERROR:  unsupported XML feature
+ LINE 1: SELECT xpath('//text()', '<root>&lt;</root>'...
+                                  ^
+ DETAIL:  This functionality requires the server to be built with libxml support.
+ HINT:  You need to rebuild PostgreSQL using --with-libxml.
+ SELECT xpath('//@value', '<root value="&lt;"/>');
+ ERROR:  unsupported XML feature
+ LINE 1: SELECT xpath('//@value', '<root value="&lt;"/>'...
+                                  ^
+ DETAIL:  This functionality requires the server to be built with libxml support.
+ HINT:  You need to rebuild PostgreSQL using --with-libxml.
  -- Test xmlexists and xpath_exists
  SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>');
  ERROR:  unsupported XML feature
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index 717a1e7..40e711a 100644
*** a/src/test/regress/sql/xml.sql
--- b/src/test/regress/sql/xml.sql
*************** SELECT xpath('', '<!-- error -->');
*** 163,168 ****
--- 163,170 ----
  SELECT xpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
  SELECT xpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
  SELECT xpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
+ SELECT xpath('//text()', '<root>&lt;</root>');
+ SELECT xpath('//@value', '<root value="&lt;"/>');
  
  -- Test xmlexists and xpath_exists
  SELECT xmlexists('//town[text() = ''Toronto'']' PASSING BY REF '<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>');