[PATCH] Add pretty-printed XML output option

Started by Jim Jonesabout 3 years ago72 messageshackers
Jump to latest
#1Jim Jones
jim.jones@uni-muenster.de

Hi,

This small patch introduces a XML pretty print function. It basically
takes advantage of the indentation feature of xmlDocDumpFormatMemory
from libxml2 to format XML strings.

postgres=# SELECT xmlpretty('<foo id="x"><bar id="y"><var
id="z">42</var></bar></foo>');
        xmlpretty
--------------------------
 <foo id="x">            +
   <bar id="y">          +
     <var id="z">42</var>+
   </bar>                +
 </foo>                  +

(1 row)

The patch also contains regression tests and documentation.

Feedback is very welcome!

Jim

Attachments:

v1-0001-Add-pretty-printed-XML-output-option.patchtext/x-patch; charset=UTF-8; name=v1-0001-Add-pretty-printed-XML-output-option.patchDownload+208-1
smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#2Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#1)
Re: [PATCH] Add pretty-printed XML output option

The system somehow returns different error messages in Linux and
MacOS/Windows, which was causing the cfbot to fail.

SELECT xmlpretty('<foo>')::xml;
                           ^
-DETAIL:  line 1: chunk is not well balanced
+DETAIL:  line 1: Premature end of data in tag foo line 1

Test removed in v2.

Show quoted text

On 02.02.23 21:35, Jim Jones wrote:

Hi,

This small patch introduces a XML pretty print function. It basically
takes advantage of the indentation feature of xmlDocDumpFormatMemory
from libxml2 to format XML strings.

postgres=# SELECT xmlpretty('<foo id="x"><bar id="y"><var
id="z">42</var></bar></foo>');
        xmlpretty
--------------------------
 <foo id="x">            +
   <bar id="y">          +
     <var id="z">42</var>+
   </bar>                +
 </foo>                  +

(1 row)

The patch also contains regression tests and documentation.

Feedback is very welcome!

Jim

Attachments:

v2-0001-Add-pretty-printed-XML-output-option.patchtext/x-patch; charset=UTF-8; name=v2-0001-Add-pretty-printed-XML-output-option.patchDownload+209-20
smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#3Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#2)
Re: [PATCH] Add pretty-printed XML output option

Hi,

The cfbot on "Windows - Server 2019, VS 2019 - Meson & ninja" is failing
the regression tests with the error:

ERROR:  unsupported XML feature
DETAIL:  This functionality requires the server to be built with libxml
support.

Is there anything I can do to enable libxml to run my regression tests?

v3 adds a missing xmlFree call.

Best, Jim

Attachments:

v3-0001-Add-pretty-printed-XML-output-option.patchtext/x-patch; charset=UTF-8; name=v3-0001-Add-pretty-printed-XML-output-option.patchDownload+219-25
#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#3)
Re: [PATCH] Add pretty-printed XML output option

Jim Jones <jim.jones@uni-muenster.de> writes:

The cfbot on "Windows - Server 2019, VS 2019 - Meson & ninja" is failing
the regression tests with the error:

ERROR:  unsupported XML feature
DETAIL:  This functionality requires the server to be built with libxml
support.

Is there anything I can do to enable libxml to run my regression tests?

Your patch needs to also update expected/xml_1.out to match the output
the test produces when run without --with-libxml.

regards, tom lane

#5Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#4)
Re: [PATCH] Add pretty-printed XML output option

On 06.02.23 17:23, Tom Lane wrote:

Your patch needs to also update expected/xml_1.out to match the output
the test produces when run without --with-libxml.

Thanks a lot! It seems to do the trick.

xml_1.out updated in v4.

Best, Jim

Attachments:

v4-0001-Add-pretty-printed-XML-output-option.patchtext/x-patch; charset=UTF-8; name=v4-0001-Add-pretty-printed-XML-output-option.patchDownload+264-26
smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#6Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#5)
Re: [PATCH] Add pretty-printed XML output option

while working on another item of the TODO list I realized that I should
be using a PG_TRY() block in he xmlDocDumpFormatMemory call.

Fixed in v5.

Best regards, Jim

Attachments:

v5-0001-Add-pretty-printed-XML-output-option.patchtext/x-patch; charset=UTF-8; name=v5-0001-Add-pretty-printed-XML-output-option.patchDownload+270-1
#7Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#6)
Re: [PATCH] Add pretty-printed XML output option

On Thu, Feb 9, 2023 at 7:31 AM Jim Jones <jim.jones@uni-muenster.de> wrote:

while working on another item of the TODO list I realized that I should
be using a PG_TRY() block in he xmlDocDumpFormatMemory call.

Fixed in v5.

I noticed the xmlFreeDoc(doc) within the PG_CATCH is guarded but the
other xmlFreeDoc(doc) is not. As the doc is assigned outside the
PG_TRY shouldn't those both be the same?

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

#8Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#7)
Re: [PATCH] Add pretty-printed XML output option

On 09.02.23 00:09, Peter Smith wrote:

I noticed the xmlFreeDoc(doc) within the PG_CATCH is guarded but the
other xmlFreeDoc(doc) is not. As the doc is assigned outside the
PG_TRY shouldn't those both be the same?

Hi Peter,

My logic there was the following: if program reached that part of the
code it means that the xml_parse() and xmlDocDumpFormatMemory() worked,
which consequently means that the variables doc and xmlbuf are != NULL,
therefore not needing to be checked. Am I missing something?

Thanks a lot for the review!

Best, Jim

Attachments:

smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#9Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#8)
Re: [PATCH] Add pretty-printed XML output option

On Thu, Feb 9, 2023 at 10:42 AM Jim Jones <jim.jones@uni-muenster.de> wrote:

On 09.02.23 00:09, Peter Smith wrote:

I noticed the xmlFreeDoc(doc) within the PG_CATCH is guarded but the
other xmlFreeDoc(doc) is not. As the doc is assigned outside the
PG_TRY shouldn't those both be the same?

Hi Peter,

My logic there was the following: if program reached that part of the
code it means that the xml_parse() and xmlDocDumpFormatMemory() worked,
which consequently means that the variables doc and xmlbuf are != NULL,
therefore not needing to be checked. Am I missing something?

Thanks. I think I understand it better now -- I expect
xmlDocDumpFormatMemory will cope OK when passed a NULL doc (see this
source [1]), but it will return nbytes of 0, but your code will still
throw ERROR, meaning the guard for doc NULL is necessary for the
PG_CATCH.

In that case, everything LGTM.

~

OTOH, if you are having to check for NULL doc anyway, maybe it's just
as easy only doing that up-front. Then you could quick-exit the
function without calling xmlDocDumpFormatMemory etc. in the first
place. For example:

doc = xml_parse(arg, XMLOPTION_DOCUMENT, false, GetDatabaseEncoding(), NULL);
if (!doc)
return 0;

------
Kind Regards,
Peter Smith.
Fujitsu Australia.

#10Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#9)
Re: [PATCH] Add pretty-printed XML output option

On 09.02.23 02:01, Peter Smith wrote:

OTOH, if you are having to check for NULL doc anyway, maybe it's just
as easy only doing that up-front. Then you could quick-exit the
function without calling xmlDocDumpFormatMemory etc. in the first
place. For example:

doc = xml_parse(arg, XMLOPTION_DOCUMENT, false, GetDatabaseEncoding(), NULL);
if (!doc)
return 0;

I see your point. If I got it right, you're suggesting the following
change in the PG_TRY();

   PG_TRY();
    {

        int nbytes;

        if(!doc)
            xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
                    "could not parse the given XML document");

        xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1);

        if(!nbytes || xmlerrcxt->err_occurred)
            xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
                    "could not indent the given XML document");

        initStringInfo(&buf);
        appendStringInfoString(&buf, (const char *)xmlbuf);

    }

.. which will catch the doc == NULL before calling xmlDocDumpFormatMemory.

Is it what you suggest?

Thanks a lot for the thorough review!

Best, Jim

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#10)
Re: [PATCH] Add pretty-printed XML output option

Jim Jones <jim.jones@uni-muenster.de> writes:

I see your point. If I got it right, you're suggesting the following
change in the PG_TRY();

   PG_TRY();
    {

        int nbytes;

        if(!doc)
            xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
                    "could not parse the given XML document");

        xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1);

        if(!nbytes || xmlerrcxt->err_occurred)
            xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
                    "could not indent the given XML document");

        initStringInfo(&buf);
        appendStringInfoString(&buf, (const char *)xmlbuf);

    }

.. which will catch the doc == NULL before calling xmlDocDumpFormatMemory.

Um ... why are you using PG_TRY here at all? It seems like
you have full control of the actually likely error cases.
The only plausible error out of the StringInfo calls is OOM,
and you probably don't want to trap that at all.

regards, tom lane

#12Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#11)
Re: [PATCH] Add pretty-printed XML output option

On 09.02.23 08:23, Tom Lane wrote:

Um ... why are you using PG_TRY here at all? It seems like
you have full control of the actually likely error cases.
The only plausible error out of the StringInfo calls is OOM,
and you probably don't want to trap that at all.

My intention was to catch any unexpected error from
xmlDocDumpFormatMemory and handle it properly. But I guess you're right,
I can control the likely error cases by checking doc and nbytes.

You suggest something along these lines?

    xmlDocPtr  doc;
    xmlChar    *xmlbuf = NULL;
    text       *arg = PG_GETARG_TEXT_PP(0);
    StringInfoData buf;
    int nbytes;

    doc = xml_parse(arg, XMLOPTION_DOCUMENT, false,
GetDatabaseEncoding(), NULL);

    if(!doc)
        elog(ERROR, "could not parse the given XML document");

    xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1);

    xmlFreeDoc(doc);

    if(!nbytes)
        elog(ERROR, "could not indent the given XML document");

    initStringInfo(&buf);
    appendStringInfoString(&buf, (const char *)xmlbuf);

    xmlFree(xmlbuf);

    PG_RETURN_XML_P(stringinfo_to_xmltype(&buf));

Thanks!

Best, Jim

Attachments:

smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#13Peter Eisentraut
peter_e@gmx.net
In reply to: Jim Jones (#1)
Re: [PATCH] Add pretty-printed XML output option

On 02.02.23 21:35, Jim Jones wrote:

This small patch introduces a XML pretty print function. It basically
takes advantage of the indentation feature of xmlDocDumpFormatMemory
from libxml2 to format XML strings.

I suggest we call it "xmlformat", which is an established term for this.

#14Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#12)
Re: [PATCH] Add pretty-printed XML output option

On Thu, Feb 9, 2023 at 7:17 PM Jim Jones <jim.jones@uni-muenster.de> wrote:

On 09.02.23 08:23, Tom Lane wrote:

Um ... why are you using PG_TRY here at all? It seems like
you have full control of the actually likely error cases.
The only plausible error out of the StringInfo calls is OOM,
and you probably don't want to trap that at all.

My intention was to catch any unexpected error from
xmlDocDumpFormatMemory and handle it properly. But I guess you're right,
I can control the likely error cases by checking doc and nbytes.

You suggest something along these lines?

xmlDocPtr doc;
xmlChar *xmlbuf = NULL;
text *arg = PG_GETARG_TEXT_PP(0);
StringInfoData buf;
int nbytes;

doc = xml_parse(arg, XMLOPTION_DOCUMENT, false,
GetDatabaseEncoding(), NULL);

if(!doc)
elog(ERROR, "could not parse the given XML document");

xmlDocDumpFormatMemory(doc, &xmlbuf, &nbytes, 1);

xmlFreeDoc(doc);

if(!nbytes)
elog(ERROR, "could not indent the given XML document");

initStringInfo(&buf);
appendStringInfoString(&buf, (const char *)xmlbuf);

xmlFree(xmlbuf);

PG_RETURN_XML_P(stringinfo_to_xmltype(&buf));

Thanks!

Something like that LGTM, but here are some other minor comments.

======

1.
FYI, there are some whitespace warnings applying the v5 patch

[postgres@CentOS7-x64 oss_postgres_misc]$ git apply
../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch
../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:26:
trailing whitespace.

../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:29:
trailing whitespace.

../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:33:
trailing whitespace.

../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:37:
trailing whitespace.

../patches_misc/v5-0001-Add-pretty-printed-XML-output-option.patch:41:
trailing whitespace.

warning: squelched 48 whitespace errors
warning: 53 lines add whitespace errors.

======
src/test/regress/sql/xml.sql

2.
The v5 patch was already testing NULL, but it might be good to add
more tests to verify the function is behaving how you want for edge
cases. For example,

+-- XML pretty print: NULL, empty string, spaces only...
SELECT xmlpretty(NULL);
SELECT xmlpretty('');
SELECT xmlpretty(' ');

~~

3.
The function is returning XML anyway, so is the '::xml' casting in
these tests necessary?

e.g.
SELECT xmlpretty(NULL)::xml; --> SELECT xmlpretty(NULL);

======
src/include/catalog/pg_proc.dat

4.

+  { oid => '4642', descr => 'Indented text from xml',
+  proname => 'xmlpretty', prorettype => 'xml',
+  proargtypes => 'xml', prosrc => 'xmlpretty' },

Spurious leading space for this new entry.

======
doc/src/sgml/func.sgml

5.
+ <foo id="x">
+   <bar id="y">
+     <var id="z">42</var>
+   </bar>
+ </foo>
+
+(1 row)
+
+]]></screen>

A spurious blank line in the example after the "(1 row)"

~~~

6.
Does this function docs belong in section 9.15.1 "Producing XML
Content"? Or (since it is not really producing content) should it be
moved to the 9.15.3 "Processing XML" section?

------
Kind Regards,
Peter Smith.
Fujitsu Australia

#15Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#14)
Re: [PATCH] Add pretty-printed XML output option

On 10.02.23 02:10, Peter Smith wrote:

On Thu, Feb 9, 2023 at 7:17 PM Jim Jones <jim.jones@uni-muenster.de> wrote:
1.
FYI, there are some whitespace warnings applying the v5 patch

Trailing whitespaces removed. The patch applies now without warnings.

======
src/test/regress/sql/xml.sql

2.
The v5 patch was already testing NULL, but it might be good to add
more tests to verify the function is behaving how you want for edge
cases. For example,

+-- XML pretty print: NULL, empty string, spaces only...
SELECT xmlpretty(NULL);
SELECT xmlpretty('');
SELECT xmlpretty(' ');

Test cases added.

3.
The function is returning XML anyway, so is the '::xml' casting in
these tests necessary?

e.g.
SELECT xmlpretty(NULL)::xml; --> SELECT xmlpretty(NULL);

It is indeed not necessary. Most likely I used it for testing and forgot
to remove it afterwards. Now removed.

======
src/include/catalog/pg_proc.dat

4.

+  { oid => '4642', descr => 'Indented text from xml',
+  proname => 'xmlpretty', prorettype => 'xml',
+  proargtypes => 'xml', prosrc => 'xmlpretty' },

Spurious leading space for this new entry.

Removed.

======
doc/src/sgml/func.sgml

5.
+ <foo id="x">
+   <bar id="y">
+     <var id="z">42</var>
+   </bar>
+ </foo>
+
+(1 row)
+
+]]></screen>

A spurious blank line in the example after the "(1 row)"

Removed.

~~~

6.
Does this function docs belong in section 9.15.1 "Producing XML
Content"? Or (since it is not really producing content) should it be
moved to the 9.15.3 "Processing XML" section?

Moved to the section 9.15.3

Following the suggestion of Peter Eisentraut I renamed the function to
xmlformat().

v6 attached.

Thanks a lot for the review.

Best, Jim

Attachments:

v6-0001-Add-pretty-printed-XML-output-option.patchtext/x-patch; charset=UTF-8; name=v6-0001-Add-pretty-printed-XML-output-option.patchDownload+280-1
smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#16Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#15)
Re: [PATCH] Add pretty-printed XML output option

Something is misbehaving.

Using the latest HEAD, and before applying the v6 patch, 'make check'
is working OK.

But after applying the v6 patch, some XML regression tests are failing
because the DETAIL part of the message indicating the wrong syntax
position is not getting displayed. Not only for your new tests -- but
for other XML tests too.

My ./configure looks like this:
./configure --prefix=/usr/local/pg_oss --with-libxml --enable-debug
--enable-cassert --enable-tap-tests CFLAGS="-ggdb -O0 -g3
-fno-omit-frame-pointer"

resulting in:
checking whether to build with XML support... yes
checking for libxml-2.0 >= 2.6.23... yes

~

e.g.(these are a sample of errors)

xml ... FAILED 2561 ms

@@ -344,8 +326,6 @@
 <twoerrors>&idontexist;</unbalanced>
                        ^
 line 1: Opening and ending tag mismatch: twoerrors line 1 and unbalanced
-<twoerrors>&idontexist;</unbalanced>
-                                    ^
 SELECT xmlparse(document '<nosuchprefix:tag/>');
       xmlparse
 ---------------------
@@ -1696,14 +1676,8 @@
 SELECT xmlformat('');
 ERROR:  invalid XML document
 DETAIL:  line 1: switching encoding : no input
-
-^
 line 1: Document is empty
-
-^
 -- XML format: invalid string (whitespaces)
 SELECT xmlformat('   ');
 ERROR:  invalid XML document
 DETAIL:  line 1: Start tag expected, '<' not found
-
-   ^

~~

Separately (but maybe it's related?), the CF-bot test also reported a
failure [1]https://api.cirrus-ci.com/v1/artifact/task/4802219812323328/testrun/build/testrun/regress/regress/regression.diffs with strange error detail differences.

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/xml.out
/tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/xml.out 2023-02-12
09:02:57.077569000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out
2023-02-12 09:05:45.148100000 +0000
@@ -1695,10 +1695,7 @@
 -- XML format: empty string
 SELECT xmlformat('');
 ERROR:  invalid XML document
-DETAIL:  line 1: switching encoding : no input
-
-^
-line 1: Document is empty
+DETAIL:  line 1: Document is empty

^
-- XML format: invalid string (whitespaces)

------
[1]: https://api.cirrus-ci.com/v1/artifact/task/4802219812323328/testrun/build/testrun/regress/regress/regression.diffs

Kind Regards,
Peter Smith.
Fujitsu Australia

#17Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#16)
Re: [PATCH] Add pretty-printed XML output option

On 13.02.23 02:15, Peter Smith wrote:

Something is misbehaving.

Using the latest HEAD, and before applying the v6 patch, 'make check'
is working OK.

But after applying the v6 patch, some XML regression tests are failing
because the DETAIL part of the message indicating the wrong syntax
position is not getting displayed. Not only for your new tests -- but
for other XML tests too.

Yes, I noticed it yesterday ... and I'm not sure how to solve it. It
seems that in the system is returning a different error message in the
FreeBSD patch tester, which is causing a regression test in this
particular OS to fail.

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/xml.out /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/xml.out	2023-02-12 09:02:57.077569000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out	2023-02-12 09:05:45.148100000 +0000
@@ -1695,10 +1695,7 @@
  -- XML format: empty string
  SELECT xmlformat('');
  ERROR:  invalid XML document
-DETAIL:  line 1: switching encoding : no input
-
-^
-line 1: Document is empty
+DETAIL:  line 1: Document is empty

^
-- XML format: invalid string (whitespaces)

Does anyone know if there is anything I can do to make the error
messages be consistent among different OS?

#18Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#17)
Re: [PATCH] Add pretty-printed XML output option

On 13.02.23 13:15, Jim Jones wrote:

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/xml.out /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/xml.out	2023-02-12 09:02:57.077569000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out	2023-02-12 09:05:45.148100000 +0000
@@ -1695,10 +1695,7 @@
-- XML format: empty string
SELECT xmlformat('');
ERROR:  invalid XML document
-DETAIL:  line 1: switching encoding : no input
-
-^
-line 1: Document is empty
+DETAIL:  line 1: Document is empty

^
-- XML format: invalid string (whitespaces)

I couldn't figure out why the error messages are different -- I'm
wondering if the issue is the test environment itself. I just removed
the troubling test case for now

SELECT xmlformat('');

v7 attached.

Thanks for reviewing this patch!

Best, Jim

Attachments:

v7-0001-Add-pretty-printed-XML-output-option.patchtext/x-patch; charset=UTF-8; name=v7-0001-Add-pretty-printed-XML-output-option.patchDownload+262-1
smime.p7sapplication/pkcs7-signature; name=smime.p7sDownload
#19Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#18)
Re: [PATCH] Add pretty-printed XML output option

On Wed, Feb 15, 2023 at 8:55 AM Jim Jones <jim.jones@uni-muenster.de> wrote:

On 13.02.23 13:15, Jim Jones wrote:

diff -U3 /tmp/cirrus-ci-build/src/test/regress/expected/xml.out /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out
--- /tmp/cirrus-ci-build/src/test/regress/expected/xml.out 2023-02-12 09:02:57.077569000 +0000
+++ /tmp/cirrus-ci-build/build/testrun/regress/regress/results/xml.out 2023-02-12 09:05:45.148100000 +0000
@@ -1695,10 +1695,7 @@
-- XML format: empty string
SELECT xmlformat('');
ERROR:  invalid XML document
-DETAIL:  line 1: switching encoding : no input
-
-^
-line 1: Document is empty
+DETAIL:  line 1: Document is empty

^
-- XML format: invalid string (whitespaces)

I couldn't figure out why the error messages are different -- I'm wondering if the issue is the test environment itself. I just removed the troubling test case for now

SELECT xmlformat('');

v7 attached.

Thanks for reviewing this patch!

Yesterday I looked at those cfbot configs and noticed all those
machines have different versions of libxml.

2.10.3
2.6.23
2.9.10
2.9.13

But I don't if version numbers have any impact on the different error
details or not.

~

The thing that puzzled me most is that in MY environment (CentOS7;
libxml 20901; PG --with-libxml build) I get this behaviour.

- Without your v6 patch 'make check' is all OK.

- With your v6 patch other XML tests (not only yours) of 'make check'
failed with different error messages.

- Similarly, if I keep the v6 patch but just change (in xmlformat) the
#ifdef USE_LIBXML to be #if 0, then only the new xmlformat tests fail,
but the other XML tests are working OK again.

Those results implied to me that this function code (in my environment
anyway) is somehow introducing a side effect causing the *other* XML
tests to fail.

But so far I was unable to identify the reason. Sorry, I don't know
this XML API well enough to help more.

------
Kind Regards,
Peter Smith.
Fujitsu Austalia.

#20Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#19)
Re: [PATCH] Add pretty-printed XML output option

On 14.02.23 23:45, Peter Smith wrote:

Those results implied to me that this function code (in my environment
anyway) is somehow introducing a side effect causing the *other* XML
tests to fail.

I believe I've found the issue. It is probably related to the XML OPTION
settings, as it seems to deliver different error messages when set to
DOCUMENT or CONTENT:

postgres=# SET XML OPTION CONTENT;
SET
postgres=# SELECT xmlformat('');
ERROR:  invalid XML document
DETAIL:  line 1: switching encoding : no input

^
line 1: Document is empty

^
postgres=# SET XML OPTION DOCUMENT;
SET
postgres=# SELECT xmlformat('');
ERROR:  invalid XML document
LINE 1: SELECT xmlformat('');
                         ^
DETAIL:  line 1: switching encoding : no input

^
line 1: Document is empty

^

v8 attached reintroduces the SELECT xmlformat('') test case and adds SET
XML OPTION DOCUMENT to the regression tests.

Best, Jim

Attachments:

v8-0001-Add-pretty-printed-XML-output-option.patchtext/x-patch; charset=UTF-8; name=v8-0001-Add-pretty-printed-XML-output-option.patchDownload+287-1
#21Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#20)
#22Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#21)
#23Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#22)
#24Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#23)
#25Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Peter Smith (#16)
#26Jim Jones
jim.jones@uni-muenster.de
In reply to: Alvaro Herrera (#25)
#27Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Jim Jones (#26)
#28Jim Jones
jim.jones@uni-muenster.de
In reply to: Alvaro Herrera (#27)
#29Jim Jones
jim.jones@uni-muenster.de
In reply to: Jim Jones (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#25)
#31Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#29)
#32Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Peter Eisentraut (#13)
#33Jim Jones
jim.jones@uni-muenster.de
In reply to: Nikolay Samokhvalov (#32)
#34Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#31)
#35Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#31)
#36Andrey Borodin
amborodin@acm.org
In reply to: Jim Jones (#35)
#37Jim Jones
jim.jones@uni-muenster.de
In reply to: Nikolay Samokhvalov (#32)
#38Jim Jones
jim.jones@uni-muenster.de
In reply to: Andrey Borodin (#36)
#39Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Jim Jones (#37)
#40Peter Eisentraut
peter_e@gmx.net
In reply to: Nikolay Samokhvalov (#39)
#41Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Eisentraut (#40)
#42Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Jim Jones (#41)
#43Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#41)
#44Jim Jones
jim.jones@uni-muenster.de
In reply to: Nikolay Samokhvalov (#42)
#45Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#43)
#46Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#45)
#47Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#46)
#48Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#47)
#49Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Smith (#48)
#50Peter Eisentraut
peter_e@gmx.net
In reply to: Jim Jones (#49)
#51Jim Jones
jim.jones@uni-muenster.de
In reply to: Peter Eisentraut (#50)
#52Peter Smith
smithpb2250@gmail.com
In reply to: Jim Jones (#51)
#53Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#51)
#54Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#53)
#55Tom Lane
tgl@sss.pgh.pa.us
In reply to: Peter Smith (#52)
#56Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#55)
#57Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#56)
#58Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#57)
#59Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#55)
#60Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#58)
#61Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#60)
#62Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#61)
#63Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#62)
#64Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#63)
#65Daniel Gustafsson
daniel@yesql.se
In reply to: Tom Lane (#64)
#66Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#63)
#67Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#66)
#68Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#67)
#69Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#68)
#70Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#69)
#71Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Jones (#70)
#72Jim Jones
jim.jones@uni-muenster.de
In reply to: Tom Lane (#71)