WIP: Fix invalid XML explain plans for track_io_timing

Started by Markus Winandabout 9 years ago3 messages
#1Markus Winand
markus.winand@winand.at
1 attachment(s)

Hi!

The XML output of explain potentially outputs the XML tag names "I/O-Write-Time"
and "I/O-Read-Time", which are invalid due to the slash.

This can easily be seen with this:
set track_io_timing = on;
explain (analyze true, buffers true, format xml) select 1;

[...]
<I/O-Read-Time>0.000</I/O-Read-Time>
<I/O-Write-Time>0.000</I/O-Write-Time>
[...]

Attached is a patch against master that translates slashes to dashes during XML
formatting (very much like spaces are already translated to dashes).

Removing the slash from those property names is another option, but is an
incompatible change to the other formats (neither JSON nor YAML have issues
with '/‘ in key names).

Although the patch fixes the problem for the moment, it is incomplete in that
sense that it continues to check against an incomplete black list. I guess
this is how it slipped in: XML explain was added in 9.0, I/O timings in 9.2.

Checking against an (abbreviated?) white list would be more future proof if new
explain properties are added. Let me know if you consider this a better approach.

I've also done a simple check to see if there are other dangerous
characters used in explain properties at the moment:

sed -n 's/.*ExplainProperty[^(]*(\s*"\([^"]*\)\".*/\1/p' src/backend/commands/explain.c |grep '[^-A-Za-z /]'

Result: none.

A similar check could be used at build-time to prevent introducing new property
names that invalidate the XML output (not sure if this could ever reach 100%
safety).

Comments?

--
Markus Winand - winand.at

Attachments:

0001-Fix-invalid-XML-explain-plans-for-track_io_timing.patchapplication/octet-stream; name=0001-Fix-invalid-XML-explain-plans-for-track_io_timing.patchDownload
From b9680099be491baf87c1baac80fe1bf27514bf89 Mon Sep 17 00:00:00 2001
From: Markus Winand <markus.winand@winand.at>
Date: Thu, 20 Oct 2016 13:15:23 +0200
Subject: [PATCH] Fix invalid XML explain plans for track_io_timing

Treat '/' in XML tag names (e.g., "I/O Read Time") like space
and translate it to '-' when formatting an explain plan as XML.
---
 src/backend/commands/explain.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 1247433..91b9158 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -3312,7 +3312,7 @@ ExplainSeparatePlans(ExplainState *es)
  * Optionally, OR in X_NOWHITESPACE to suppress the whitespace we'd normally
  * add.
  *
- * XML tag names can't contain white space, so we replace any spaces in
+ * XML tag names can't contain white space or slashes, so we replace them in
  * "tagname" with dashes.
  */
 static void
@@ -3326,7 +3326,7 @@ ExplainXMLTag(const char *tagname, int flags, ExplainState *es)
 	if ((flags & X_CLOSING) != 0)
 		appendStringInfoCharMacro(es->str, '/');
 	for (s = tagname; *s; s++)
-		appendStringInfoCharMacro(es->str, (*s == ' ') ? '-' : *s);
+		appendStringInfoCharMacro(es->str, (*s == ' ' || *s == '/') ? '-' : *s);
 	if ((flags & X_CLOSE_IMMEDIATE) != 0)
 		appendStringInfoString(es->str, " /");
 	appendStringInfoCharMacro(es->str, '>');
-- 
2.8.4 (Apple Git-73)

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Markus Winand (#1)
Re: WIP: Fix invalid XML explain plans for track_io_timing

Markus Winand <markus.winand@winand.at> writes:

The XML output of explain potentially outputs the XML tag names "I/O-Write-Time"
and "I/O-Read-Time", which are invalid due to the slash.

Ooops.

Although the patch fixes the problem for the moment, it is incomplete in that
sense that it continues to check against an incomplete black list. I guess
this is how it slipped in: XML explain was added in 9.0, I/O timings in 9.2.

Yeah. The whitelist approach would look something like

appendStringInfoChar(es->str, strchr(XMLCHARS, *s) ? *s : '-');

which would be quite a few more cycles than just testing for ' ' and '/'.
So I'm not sure it's worth it. On the other hand, I have little faith
that we wouldn't make a similar mistake in future.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#2)
Re: WIP: Fix invalid XML explain plans for track_io_timing

I wrote:

Markus Winand <markus.winand@winand.at> writes:

The XML output of explain potentially outputs the XML tag names "I/O-Write-Time"
and "I/O-Read-Time", which are invalid due to the slash.

Ooops.

After further thought I decided we should go with the whitelist solution.
The extra time needed to produce XML-format output isn't really likely to
bother anyone. And given that this bug escaped notice for several years,
it seems likely that the next time somebody decides to be creative about
picking a tag name, we might not notice an XML syntax violation for
several more years. So a future-proof fix seems advisable.

I pushed a patch using the strchr() approach. Thanks for reporting this!

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers