xpath inserts unexpected newlines
Hello guys,
It seems that the xpath function adds unexpected newlines to the XML elements it returns as an array:
postgres=> select version();
version
------------------------------------------------------------
PostgreSQL 10.3, compiled by Visual C++ build 1800, 64-bit
(1 row)
postgres=> select ((xpath('/*',xml('<root><a/><b/><c/></root>')))[1])::text;
  xpath
---------
 <root> +
   <a/> +
   <b/> +
   <c/> +
 </root>
(1 row)
A workaround is to have at least one element with a value:
postgres=> select ((xpath('/*',xml('<root><a/><b/><c/>one value</root>')))[1])::text;
xpath
------------------------------------
<root><a/><b/><c/>one value</root>
(1 row)
Best regards.
[ redirecting to pgsql-bugs ]
"Voillequin, Jean-Marc" <Jean-Marc.Voillequin@moodys.com> writes:
> It seems that the xpath function adds unexpected newlines to the XML elements it returns as an array:
> postgres=> select ((xpath('/*',xml('<root><a/><b/><c/></root>')))[1])::text;
>   xpath
> ---------
>  <root> +
>    <a/> +
>    <b/> +
>    <c/> +
>  </root>
> (1 row)
> A workaround is to have at least one element with a value:
> postgres=> select ((xpath('/*',xml('<root><a/><b/><c/>one value</root>')))[1])::text;
> xpath
> ------------------------------------
> <root><a/><b/><c/>one value</root>
> (1 row)
Wow, that's bizarre. It seems that this behavior is down to xmlNodeDump,
which is what we use to produce xpath's output. I'm not sure why it
chooses to pretty-print one case and not the other, but I'd be inclined
to say that for PG's purposes we don't want any pretty-printing.
I tried this:
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index dae7d58..48b8034 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -3857,7 +3857,7 @@ xml_xmlnodetoxmltype(xmlNodePtr cur, PgXmlErrorContext *xmlerrcxt)
 			nodefree = (cur_copy->type == XML_DOCUMENT_NODE) ?
 				(void (*) (xmlNodePtr)) xmlFreeDoc : xmlFreeNode;
-			bytes = xmlNodeDump(buf, NULL, cur_copy, 0, 1);
+			bytes = xmlNodeDump(buf, NULL, cur_copy, 0, 0);
 			if (bytes == -1 || xmlerrcxt->err_occurred)
 				xml_ereport(xmlerrcxt, ERROR, ERRCODE_OUT_OF_MEMORY,
 							"could not dump node");
and that makes the discrepancy go away (both results are printed without
any added whitespace). There is one change in the regression test
outputs, which is an xpath query that's been affected by this very
same issue:
SELECT xpath('//loc:piece', '<local:data xmlns:local="http://127.0.0.1" xmlns="http://127.0.0.2"><local:piece id="1"><internal>number one</internal><internal2/></local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc', 'http://127.0.0.1']]);
- xpath
---------------------------------------------------------------------------------------
- {"<local:piece xmlns:local=\"http://127.0.0.1\" xmlns=\"http://127.0.0.2\" id=\"1\">+
- <internal>number one</internal> +
- <internal2/> +
- </local:piece>","<local:piece xmlns:local=\"http://127.0.0.1\" id=\"2\"/>"}
+ xpath
+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
+ {"<local:piece xmlns:local=\"http://127.0.0.1\" xmlns=\"http://127.0.0.2\" id=\"1\"><internal>number one</internal><internal2/></local:piece>","<local:piece xmlns:local=\"http://127.0.0.1\" id=\"2\"/>"}
(1 row)
Thoughts?
regards, tom lane
I wrote:
> [ redirecting to pgsql-bugs ]
> "Voillequin, Jean-Marc" <Jean-Marc.Voillequin@moodys.com> writes:
>> It seems that the xpath function adds unexpected newlines to the XML elements it returns as an array:
> I tried this:
> -			bytes = xmlNodeDump(buf, NULL, cur_copy, 0, 1);
> +			bytes = xmlNodeDump(buf, NULL, cur_copy, 0, 0);
> and that makes the discrepancy go away (both results are printed without
> any added whitespace).
> Thoughts?
[ crickets ]
Hearing nothing, I propose to change this in HEAD but not back-patch.
It still seems like a bug, but given the lack of previous complaints,
I'm not sure we'd get plaudits for changing this behavior in back
branches.
regards, tom lane
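
Since the change is proposed for HEAD only, released branches would keep returning the pretty-printed form. A minimal client-side sketch follows, assuming the whitespace-only text nodes in the xpath result are never significant to the application. The helper name strip_indentation is hypothetical, and it relies on Python's standard xml.dom.minidom rather than on anything in PostgreSQL or libxml2:

```python
from xml.dom import minidom


def strip_indentation(xml_text: str) -> str:
    """Drop whitespace-only text nodes from one XML element given as text.

    Assumption: such nodes are indentation added on output, not real content.
    """
    doc = minidom.parseString(xml_text)

    def prune(node):
        # Iterate over a copy, because removeChild mutates childNodes.
        for child in list(node.childNodes):
            if child.nodeType == child.TEXT_NODE and not child.data.strip():
                node.removeChild(child)
            elif child.nodeType == child.ELEMENT_NODE:
                prune(child)

    prune(doc.documentElement)
    return doc.documentElement.toxml()


# The pretty-printed value reported above, as the client would receive it.
pretty = "<root>\n  <a/>\n  <b/>\n  <c/>\n</root>"
print(strip_indentation(pretty))
```

Text nodes with real content, such as "one value" in the second query, are left alone, so both of the reported cases come out in the same unindented form.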