proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Started by Pavel Stehule, almost 9 years ago, 35 messages
#1 Pavel Stehule
pavel.stehule@gmail.com
1 attachment(s)

Hi

This proposal is a follow-up to the implementation of XMLTABLE.

Many XML documents declare a default namespace for the document.

<rows xmlns="http://x.y"><row><a>10</a></row></rows>

For such a document, every search path must reference the namespace "http://x.y". This
is not very intuitive, and it makes XMLTABLE less user-friendly,
because the default column path (the same as the column name) cannot be used. A
solution to this issue is a default namespace, as defined in SQL/XML.

Example, using the XML above:

Without a default namespace:
XMLTABLE(XMLNAMESPACES('http://x.y' AS aux),
'/aux:rows/aux:row' PASSING ...
COLUMNS a int PATH 'aux:a')

With a default namespace:
XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' PASSING ...
COLUMNS a int);

Unfortunately, libxml2 doesn't support default namespaces in XPath
expressions. Because libxml2's functionality is frozen, there is little
chance of support in the near future. An implementation is not too hard,
although it requires a simple translator for XPath expressions.

Other databases that implement XMLTABLE support a default namespace for
XPath expressions.
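
To make the idea of the translator concrete, here is a minimal sketch in C of the kind of rewriting involved. It is not the code from the attached patch: the function name add_default_prefix and its helpers are hypothetical, only ASCII names are recognized, and it assumes a plain location path without predicates, attributes, function calls, axes, or string literals.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: may c start an XML name? (ASCII only) */
static int is_name_start(char c)
{
    return isalpha((unsigned char) c) || c == '_';
}

/* Hypothetical helper: may c continue an XML name? (ASCII only) */
static int is_name_char(char c)
{
    return isalnum((unsigned char) c) || c == '_' || c == '-' || c == '.';
}

/*
 * Rewrite a simple location path so that every unprefixed element name
 * carries "prefix:".  Already qualified names are copied unchanged.
 * Returns 0 on success, -1 if the output buffer is too small.
 */
int add_default_prefix(const char *xpath, const char *prefix,
                       char *out, size_t outsz)
{
    size_t n = 0;
    size_t plen;

    assert(xpath != NULL && prefix != NULL && out != NULL);
    if (outsz == 0)
        return -1;
    plen = strlen(prefix);

    while (*xpath != '\0')
    {
        if (is_name_start(*xpath))
        {
            const char *end = xpath;
            size_t len;

            while (is_name_char(*end))
                end++;

            if (*end == ':')
            {
                /* Already qualified: keep "prefix:local" as it is. */
                end++;
                while (is_name_char(*end))
                    end++;
            }
            else
            {
                /* Unqualified: emit the default prefix and a colon. */
                if (n + plen + 1 >= outsz)
                    return -1;
                memcpy(out + n, prefix, plen);
                n += plen;
                out[n++] = ':';
            }

            len = (size_t) (end - xpath);
            if (n + len >= outsz)
                return -1;
            memcpy(out + n, xpath, len);
            n += len;
            xpath = end;
        }
        else
        {
            /* Separators such as '/' are copied verbatim. */
            if (n + 1 >= outsz)
                return -1;
            out[n++] = *xpath++;
        }
    }
    out[n] = '\0';
    return 0;
}
```

With a prefix such as "ns" registered for http://x.y, the path '/rows/row' would be rewritten to '/ns:rows/ns:row' before being handed to libxml2, while '/aux:rows/aux:row' would pass through unchanged.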

A patch with an initial implementation is attached.

Regards

Pavel

Attachments:

xml-xpath-default-ns.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 583b3b241a..c2558a33ef 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10465,8 +10465,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</m
     <para>
      The optional <literal>XMLNAMESPACES</> clause is a comma-separated
      list of namespaces.  It specifies the XML namespaces used in
-     the document and their aliases. A default namespace specification
-     is not currently supported.
+     the document and their aliases.
     </para>
 
     <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 0f512753e4..5a3715cc84 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index f81cf489d2..d59a76f0b4 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -91,7 +91,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -184,6 +184,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -4180,6 +4181,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4272,6 +4274,8 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif   /* not USE_LIBXML */
 }
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace"
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4282,12 +4286,14 @@ XmlTableSetNamespace(TableFuncScanState *state, char *name, char *uri)
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name == NULL)
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4316,6 +4322,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4347,6 +4361,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..7ec2d584c6
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,323 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML related functionality is based on libxml2 library, and
+ * XPath support is not an exception.  However, libxml2 doesn't support
+ * default namespace for XPath expressions. Because there are not any API
+ * how to transform or access to parsed XPath expression we have to parse
+ * XPath here.
+ *
+ * Those functionalities are implemented with a simple XPath parser/
+ * preprocessor.  This XPath parser transforms a XPath expression to another
+ * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
+ * replace libxml2 XPath parser or libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens.  When expression starting with
+ * nodename, then we can use prefix.  When default namespace is defined, then we
+ * should to enhance any nodename and attribute without namespace by default
+ * namespace.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_OTHER
+}	XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	char	   *start;
+	int			length;
+}	XPathTokenInfo;
+
+#define TOKEN_STACK_SIZE		10
+
+typedef struct ParserData
+{
+	char	   *str;
+	char	   *cur;
+	XPathTokenInfo stack[TOKEN_STACK_SIZE];
+	int			stack_length;
+}	XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define NODENAME_FIRSTCHAR(c)	 ((c) == '_' || (c) == '-' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(NODENAME_FIRSTCHAR(c) || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static char *
+getXPathToken(char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->stack_length = 0;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (parser->stack_length > 0)
+		memcpy(ti, &parser->stack[--parser->stack_length],
+			   sizeof(XPathTokenInfo));
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (parser->stack_length == TOKEN_STACK_SIZE)
+		elog(ERROR, "internal error");
+	memcpy(&parser->stack[parser->stack_length++], ti,
+		   sizeof(XPathTokenInfo));
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+}
+
+/*
+ * This is main part of XPath transformation. It can be called recursivly,
+ * when XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		last_token_is_name = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+				last_token_is_name = false;
+				writeXPathToken(str, &t1);
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					bool		is_qual_name = false;
+
+					/* inside predicate ignore keywords "and" "or" */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "or", 2) == 0 && t1.length == 2))
+						{
+							writeXPathToken(str, &t1);
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					last_token_is_name = true;
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						if (*t2.start == '(')
+							last_token_is_name = false;
+						else if (*t2.start == ':')
+							is_qual_name = true;
+					}
+
+					if (last_token_is_name && !is_qual_name && def_namespace_name != NULL)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					writeXPathToken(str, &t1);
+
+					if (is_qual_name)
+					{
+						writeXPathToken(str, &t2);
+						nextXPathToken(parser, &t1);
+						if (t1.ttype == XPATH_TOKEN_NAME)
+							writeXPathToken(str, &t1);
+						else
+							pushXPathToken(parser, &t1);
+					}
+					else
+						pushXPathToken(parser, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						last_token_is_name = false;
+
+						if (c == ']' && inside_predicate)
+							return;
+
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+							{
+								bool		is_qual_name = false;
+
+								nextXPathToken(parser, &t2);
+								if (t2.ttype == XPATH_TOKEN_OTHER && *t2.start == ':')
+									is_qual_name = true;
+
+								if (!is_qual_name && def_namespace_name != NULL)
+									appendStringInfo(str, "%s:", def_namespace_name);
+
+								writeXPathToken(str, &t1);
+								if (is_qual_name)
+								{
+									writeXPathToken(str, &t2);
+									nextXPathToken(parser, &t1);
+									if (t1.ttype == XPATH_TOKEN_NAME)
+										writeXPathToken(str, &t1);
+									else
+										pushXPathToken(parser, &t1);
+								}
+								else
+									pushXPathToken(parser, &t2);
+							}
+							else
+								pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+}
+
+#endif
diff --git a/src/include/utils/xpath_parser.h b/src/include/utils/xpath_parser.h
new file mode 100644
index 0000000000..b2fc239e12
--- /dev/null
+++ b/src/include/utils/xpath_parser.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.h
+ *	  Declarations for XML XPath transformation.
+ *
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/xml.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef XPATH_PARSER_H
+#define XPATH_PARSER_H
+
+#include "postgres.h"
+#include "lib/stringinfo.h"
+
+void transformXPath(StringInfo str, char *xpath, char *def_namespace_name);
+
+#endif   /* XPATH_PARSER_H */
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index bcc585d427..9c543edad6 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1085,7 +1085,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
#2 Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Pavel Stehule (#1)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hello, this patch has been ignored for a long time since it was proposed...

At Sat, 11 Mar 2017 20:44:31 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRB+WDyDcZyGmfRdJ0HOoXugeaL-KNFeK9YA5Z10JN9qfA@mail.gmail.com>

The original message is not very informative for those who
want to review this but, like me, are not familiar with this
area. I will try to augment it with a bit more information.

An example of the issue is as follows.

create table t1 (id int, doc xml);
insert into t1
values
(1, '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'),
(2, '<rows xmlns="http://x.y"><row><a>20</a></row></rows>'),
(3, '<rows xmlns="http://x.y"><row><a>30</a></row></rows>'),
(4, '<rows xmlns="http://x.y"><row><a>40</a></row></rows>');
select x.* from t1, xmltable('/rows/row' passing t1.doc columns data int PATH 'a') as x;
| data
| ------
| (0 rows)
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' as n), '/n:rows/n:row' passing t1.doc columns data int PATH 'n:a') as x;
| data
| ------
| 10
| 20
| 30
| 40
| (4 rows)

But currently the following command fails with an error.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' passing t1.doc columns data int PATH 'a') as x;
| ERROR: DEFAULT namespace is not supported

This patch makes PostgreSQL accept this syntax by transforming the XPath
string when a DEFAULT namespace is defined.

=======================
I have some review comments.

This patch still applies (with offsets) and works as expected.

1. Uniformity among similar features

As mentioned in the proposal, the XPath transformer is applied
only to xmltable and not to the other XPath-related functions,
which is a lack of uniformity.

2. XML conformance

I'm not yet sure this works for the full range of the
available syntax, but it fails at least for the following
expression.

(delete from t1;)
insert into t1
values
(5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' passing t1.doc columns data int PATH 'a[1][@hoge]') as x;

data
------

(1 row)

The following expression works.

select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' as x), '/x:rows/x:row' passing t1.doc columns data int PATH 'x:a[1][@hoge]') as x;

data
------
50
(1 row)

The W3C says the following:

https://www.w3.org/TR/xml-names/#defaulting

The namespace name for an unprefixed attribute name always has no value.

We currently have no means of making sure that this works
correctly across the whole syntax. Would more regression tests help?
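
The rule can be sketched as a small predicate in C. This is an illustration only, not code from the patch: the function name name_takes_default_ns is hypothetical, it recognizes ASCII names only, it assumes no whitespace between '@' and the attribute name, and it ignores the keywords "and"/"or" and axis names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical predicate: should the name that starts at expr[pos]
 * receive the default-namespace prefix?  Returns 1 for yes, 0 for no.
 */
int name_takes_default_ns(const char *expr, size_t pos)
{
    size_t end = pos;

    assert(expr != NULL && pos < strlen(expr));

    /* Attribute names ("@hoge") never pick up the default namespace. */
    if (pos > 0 && expr[pos - 1] == '@')
        return 0;

    /* Local part of an already qualified name ("x:a"). */
    if (pos > 0 && expr[pos - 1] == ':')
        return 0;

    /* Find the end of the name. */
    while (isalnum((unsigned char) expr[end]) ||
           expr[end] == '_' || expr[end] == '-' || expr[end] == '.')
        end++;

    /* Prefix part of a qualified name, or a function call like text(). */
    if (expr[end] == ':' || expr[end] == '(')
        return 0;

    return 1;
}
```

For 'a[1][@hoge]' the predicate says yes for a and no for hoge, so only the element name would be qualified, which corresponds to the explicitly prefixed path 'x:a[1][@hoge]' that works.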

3. The predefined prefix for default namespace

The patch defines the name of the default namespace as
"pgdefnamespace". If a default namespace is defined, a
namespace is registered under that name and with_default_ns is true. If
a namespace with that name is defined explicitly, a namespace is also
registered under the same name but with_default_ns is false. This causes
confusing behavior.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.x', 'http://x.y' as pgdefnamespace), '/rows/row' passing t1.doc columns data int PATH 'a') as x;
| data
| ------
| 10
| 20
| 30
| 40
| (4 rows)

The reason for this design is that xmlXPathRegisterNs
doesn't accept NULL or an empty string as a namespace prefix; it
only accepts a string consisting of valid XML characters.

Even if we are to live with such a restriction and such a name is
hardly ever used, a namespace prefix with that name should be
rejected.
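
One way to do the rejection, sketched in C under assumptions: resolve_ns_prefix is a hypothetical helper, not a function in xml.c, and returning NULL stands in for raising an error with ereport.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same literal as DEFAULT_NAMESPACE_NAME in the patch under review. */
#define RESERVED_DEFAULT_PREFIX "pgdefnamespace"

/*
 * Hypothetical helper: choose the prefix to register for one
 * XMLNAMESPACES entry.  user_prefix == NULL stands for DEFAULT.
 * Returns NULL when the user-supplied prefix must be rejected because
 * it collides with the reserved internal name.
 */
const char *resolve_ns_prefix(const char *user_prefix, int *is_default)
{
    assert(is_default != NULL);

    if (user_prefix == NULL)
    {
        /* DEFAULT namespace: use the reserved internal prefix. */
        *is_default = 1;
        return RESERVED_DEFAULT_PREFIX;
    }

    *is_default = 0;
    if (strcmp(user_prefix, RESERVED_DEFAULT_PREFIX) == 0)
        return NULL;            /* caller would raise an error here */

    return user_prefix;
}
```

With such a check, the query above that declares both DEFAULT and an explicit pgdefnamespace prefix would fail with an error instead of silently using the explicit declaration.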

4. A mistake in the documentation?

The documentation describes the XMLNAMESPACES clause as
follows.

https://www.postgresql.org/docs/10/static/functions-xml.html

The optional XMLNAMESPACES clause is a comma-separated list of
namespaces. It specifies the XML namespaces used in the document
and their aliases.

As far as I can tell from XmlTableSetNamespace (and if I read the
documentation correctly), the defined namespaces are applied not
to documents but to expressions. This patch is not to blame for
this, so it should be a separate patch, back-patchable to Pg10.

| The optional XMLNAMESPACES clause is a comma-separated list of
| namespaces. It specifies the XML namespaces used in the
| row_expression.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center


#3 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Kyotaro HORIGUCHI (#2)
1 attachment(s)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi

2017-09-25 13:25 GMT+02:00 Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>:

1. Uniformity among similar features

I have to fix the XPath function. The SQL/XML function XMLEXISTS doesn't
support namespaces.

2. XML conformance

I fixed this issue and used your examples as regression tests.

3. The predefined prefix for default namespace

It is fixed; see the regression test.

Regards

Pavel

Attachments:

xml-xpath-default-ns-2.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2f036015cc..e29ef152fd 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10571,8 +10571,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</m
     <para>
      The optional <literal>XMLNAMESPACES</> clause is a comma-separated
      list of namespaces.  It specifies the XML namespaces used in
-     the document and their aliases. A default namespace specification
-     is not currently supported.
+     the document and their aliases.
     </para>
 
     <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 1fb018416e..b60a3cfe0d 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 24229c2dff..75f33cfc71 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -90,7 +90,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -187,6 +187,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -4195,6 +4196,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4287,6 +4289,8 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif							/* not USE_LIBXML */
 }
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace.pgsqlxml.internal"
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4297,12 +4301,24 @@ XmlTableSetNamespace(TableFuncScanState *state, char *name, char *uri)
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name != NULL)
+	{
+		/* Don't allow same namespace as out internal default namespace name */
+		if (strcmp(name, DEFAULT_NAMESPACE_NAME) == 0)
+			ereport(ERROR,
+						(errmsg("cannot to use \"%s\" as namespace name",
+								  DEFAULT_NAMESPACE_NAME),
+						 errdetail("\"%s\" is reserved for internal purpose",
+								  DEFAULT_NAMESPACE_NAME)));
+	}
+	else
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4331,6 +4347,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4362,6 +4386,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..ed5a071a0a
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,328 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML related functionality is based on libxml2 library, and
+ * XPath support is not an exception.  However, libxml2 doesn't support
+ * default namespace for XPath expressions. Because there are not any API
+ * how to transform or access to parsed XPath expression we have to parse
+ * XPath here.
+ *
+ * Those functionalities are implemented with a simple XPath parser/
+ * preprocessor.  This XPath parser transforms a XPath expression to another
+ * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
+ * replace libxml2 XPath parser or libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens.  When a token is a node
+ * name, we can prepend a prefix to it.  When a default namespace is defined,
+ * we qualify every node name that has no namespace prefix with the default
+ * namespace prefix; unprefixed attribute names are left as they are.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_OTHER
+}	XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	char	   *start;
+	int			length;
+}	XPathTokenInfo;
+
+#define TOKEN_STACK_SIZE		10
+
+typedef struct ParserData
+{
+	char	   *str;
+	char	   *cur;
+	XPathTokenInfo stack[TOKEN_STACK_SIZE];
+	int			stack_length;
+}	XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define NODENAME_FIRSTCHAR(c)	 ((c) == '_' || (c) == '-' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(NODENAME_FIRSTCHAR(c) || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static char *
+getXPathToken(char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->stack_length = 0;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (parser->stack_length > 0)
+		memcpy(ti, &parser->stack[--parser->stack_length],
+			   sizeof(XPathTokenInfo));
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (parser->stack_length == TOKEN_STACK_SIZE)
+		elog(ERROR, "internal error");
+	memcpy(&parser->stack[parser->stack_length++], ti,
+		   sizeof(XPathTokenInfo));
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+}
+
+/*
+ * This is the main part of the XPath transformation.  It is called
+ * recursively when the XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		last_token_is_name = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+				last_token_is_name = false;
+				writeXPathToken(str, &t1);
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					bool		is_qual_name = false;
+
+					/* inside predicate ignore keywords "and" "or" */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "or", 2) == 0 && t1.length == 2))
+						{
+							writeXPathToken(str, &t1);
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					last_token_is_name = true;
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						if (*t2.start == '(')
+							last_token_is_name = false;
+						else if (*t2.start == ':')
+							is_qual_name = true;
+					}
+
+					if (last_token_is_name && !is_qual_name && def_namespace_name != NULL)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					writeXPathToken(str, &t1);
+
+					if (is_qual_name)
+					{
+						writeXPathToken(str, &t2);
+						nextXPathToken(parser, &t1);
+						if (t1.ttype == XPATH_TOKEN_NAME)
+							writeXPathToken(str, &t1);
+						else
+							pushXPathToken(parser, &t1);
+					}
+					else
+						pushXPathToken(parser, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						last_token_is_name = false;
+
+						if (c == ']' && inside_predicate)
+							return;
+
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+							{
+								bool		is_qual_name = false;
+
+								/*
+								 * A default namespace declaration applies to all
+								 * unprefixed element names within its scope. Default
+								 * namespace declarations do not apply directly to
+								 * attribute names; the interpretation of unprefixed
+								 * attributes is determined by the element on which
+								 * they appear.
+								 */
+								nextXPathToken(parser, &t2);
+								if (t2.ttype == XPATH_TOKEN_OTHER && *t2.start == ':')
+									is_qual_name = true;
+
+								writeXPathToken(str, &t1);
+								if (is_qual_name)
+								{
+									writeXPathToken(str, &t2);
+									nextXPathToken(parser, &t1);
+									if (t1.ttype == XPATH_TOKEN_NAME)
+										writeXPathToken(str, &t1);
+									else
+										pushXPathToken(parser, &t1);
+								}
+								else
+									pushXPathToken(parser, &t2);
+							}
+							else
+								pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+}
+
+#endif
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index bcc585d427..b069286423 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1085,7 +1085,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1452,3 +1456,22 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index eb4687fb09..b18d1b5eab 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -558,3 +558,13 @@ INSERT INTO xmltest2 VALUES('<d><r><dc>2</dc></r></d>', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
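The rewrite that xpath_parser.c performs in the patch above can be illustrated with a much smaller standalone routine. This is only a minimal sketch under stated assumptions, not the patch's code: `qualify_xpath`, `put`, `is_name_start` and `is_name_char` are hypothetical names, the scanner works character by character instead of using a token stack, and it also copies single-quoted literals verbatim, whereas the patch's lexer only recognizes double-quoted strings. Like the patch, it leaves attribute names, function names, already prefixed names and the "and"/"or" operators inside predicates untouched.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration; none of these names come from the patch. */
static int is_name_start(unsigned char c)
{
    return c == '_' || (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') || c >= 0x80;
}

static int is_name_char(unsigned char c)
{
    return is_name_start(c) || c == '-' || c == '.' || (c >= '0' && c <= '9');
}

/* Append n bytes of s to out; returns 0 if the cap-byte buffer would overflow. */
static int put(char *out, size_t cap, size_t *len, const char *s, size_t n)
{
    if (*len + n + 1 > cap)
        return 0;
    memcpy(out + *len, s, n);
    *len += n;
    out[*len] = '\0';
    return 1;
}

/* Copy "in" to "out", prepending "prefix:" to every unprefixed element name.
 * Returns 1 on success, 0 if "out" is too small. */
int qualify_xpath(const char *in, const char *prefix, char *out, size_t cap)
{
    const char *p = in;
    size_t len = 0;
    int depth = 0;      /* predicate nesting level */
    int keep_next = 0;  /* next name follows '@' or "prefix:", copy it as is */

    assert(in != NULL && prefix != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    while (*p != '\0')
    {
        unsigned char c = (unsigned char) *p;

        if (c == '"' || c == '\'')
        {
            /* String literal: copy verbatim through the closing quote. */
            const char *q = p + 1;

            while (*q != '\0' && *q != *p)
                q++;
            if (*q != '\0')
                q++;
            if (!put(out, cap, &len, p, (size_t) (q - p)))
                return 0;
            p = q;
            keep_next = 0;
        }
        else if (is_name_start(c))
        {
            const char *q = p + 1, *next;
            size_t n;
            int is_op;

            while (is_name_char((unsigned char) *q))
                q++;
            n = (size_t) (q - p);
            for (next = q; *next == ' '; next++)
                continue;
            /* "and" / "or" inside a predicate are operators, not names. */
            is_op = depth > 0 &&
                ((n == 3 && strncmp(p, "and", 3) == 0) ||
                 (n == 2 && strncmp(p, "or", 2) == 0));
            /* Leave attributes, operators, function calls and prefixes alone. */
            if (!keep_next && !is_op && *next != '(' && *next != ':' &&
                (!put(out, cap, &len, prefix, strlen(prefix)) ||
                 !put(out, cap, &len, ":", 1)))
                return 0;
            if (!put(out, cap, &len, p, n))
                return 0;
            p = q;
            keep_next = 0;
        }
        else
        {
            if (c == '[')
                depth++;
            else if (c == ']' && depth > 0)
                depth--;
            if (c == '@')
                keep_next = 1;
            else if (c == ':')  /* a single ':' separates prefix and local name */
                keep_next = (p[1] != ':' && (p == in || p[-1] != ':'));
            else if (c != ' ')
                keep_next = 0;
            if (!put(out, cap, &len, p, 1))
                return 0;
            p++;
        }
    }
    return 1;
}
```

Called with the expression `/rows/row/a[1][@hoge]` and the prefix `pg`, this sketch produces `/pg:rows/pg:row/pg:a[1][@hoge]`. In the patch the prefix would be the internal name `pgdefnamespace.pgsqlxml.internal`, which is registered with libxml2 for the default namespace URI.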
#4Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#3)
1 attachment(s)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi

xpath and xpath_exists now support a default namespace too.

I also updated the docs and fixed all variants of the expected regression
test output files.

Regards

Pavel
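
The handling of the namespace array for xpath and xpath_exists in this version can be summarized with a small standalone sketch. It is an illustration under stated assumptions, not the patch's code: `NsPair`, `NsCheckResult` and `check_namespaces` are hypothetical names, and status codes stand in for the errors that the patch raises with ereport. The reserved name is the value of `DEFAULT_NAMESPACE_NAME` from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Value of DEFAULT_NAMESPACE_NAME in the patch. */
#define RESERVED_NS_NAME "pgdefnamespace.pgsqlxml.internal"

/* Hypothetical result codes; the patch reports these cases as errors. */
typedef enum
{
    NS_OK,
    NS_RESERVED_NAME,       /* user tried to use the internal name */
    NS_DUPLICATE_DEFAULT    /* more than one entry with an empty name */
} NsCheckResult;

/* Hypothetical pair type standing in for one {name, uri} array element. */
typedef struct
{
    const char *name;
    const char *uri;
} NsPair;

/*
 * Walk the namespace list.  An empty name marks the default namespace and is
 * replaced by the reserved internal name; at most one such entry is allowed.
 * *with_default_ns is set to 1 when a default namespace was found.
 */
NsCheckResult check_namespaces(NsPair *ns, size_t count, int *with_default_ns)
{
    size_t i;

    assert((ns != NULL || count == 0) && with_default_ns != NULL);
    *with_default_ns = 0;

    for (i = 0; i < count; i++)
    {
        if (strcmp(ns[i].name, RESERVED_NS_NAME) == 0)
            return NS_RESERVED_NAME;

        if (ns[i].name[0] == '\0')
        {
            if (*with_default_ns)
                return NS_DUPLICATE_DEFAULT;
            *with_default_ns = 1;
            ns[i].name = RESERVED_NS_NAME;
        }
    }
    return NS_OK;
}
```

When the flag comes back set, the XPath expression is rewritten so that unprefixed element names carry the reserved prefix before the expression is compiled.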

Attachments:

xml-xpath-default-ns-3.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2f036015cc..610f709933 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10477,7 +10477,8 @@ SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuf
      second the namespace URI. It is not required that aliases provided in
      this array be the same as those being used in the XML document itself (in
      other words, both in the XML document and in the <function>xpath</function>
-     function context, aliases are <emphasis>local</>).
+     function context, aliases are <emphasis>local</>). The default namespace
+     has an empty name (empty string), and only one may be specified.
     </para>
 
     <para>
@@ -10496,8 +10497,8 @@ SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
     <para>
      To deal with default (anonymous) namespaces, do something like this:
 <screen><![CDATA[
-SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
-             ARRAY[ARRAY['mydefns', 'http://example.com']]);
+SELECT xpath('//b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
+             ARRAY[ARRAY['', 'http://example.com']]);
 
  xpath
 --------
@@ -10571,8 +10572,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</m
     <para>
      The optional <literal>XMLNAMESPACES</> clause is a comma-separated
      list of namespaces.  It specifies the XML namespaces used in
-     the document and their aliases. A default namespace specification
-     is not currently supported.
+     the document and their aliases.
     </para>
 
     <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 1fb018416e..b60a3cfe0d 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 24229c2dff..90c51ea1c6 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -90,7 +90,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -187,6 +187,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -227,6 +228,7 @@ const TableFuncRoutine XmlTableRoutine =
 #define NAMESPACE_XSI "http://www.w3.org/2001/XMLSchema-instance"
 #define NAMESPACE_SQLXML "http://standards.iso.org/iso/9075/2003/sqlxml"
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace.pgsqlxml.internal"
 
 #ifdef USE_LIBXML
 
@@ -3849,6 +3851,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	int			ndim;
 	Datum	   *ns_names_uris;
 	bool	   *ns_names_uris_nulls;
+	bool		with_default_ns = false;
 	int			ns_count;
 
 	/*
@@ -3898,7 +3901,6 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 				 errmsg("empty XPath expression")));
 
 	string = pg_xmlCharStrndup(datastr, len);
-	xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -3941,6 +3943,26 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 							(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 							 errmsg("neither namespace name nor URI may be null")));
 				ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+
+				/* Don't allow a namespace name equal to our internal default namespace name */
+				if (strcmp(ns_name, DEFAULT_NAMESPACE_NAME) == 0)
+					ereport(ERROR,
+								(errcode(ERRCODE_RESERVED_NAME),
+								 errmsg("cannot to use \"%s\" as namespace name",
+										  DEFAULT_NAMESPACE_NAME),
+								 errdetail("\"%s\" is reserved for internal purpose",
+										  DEFAULT_NAMESPACE_NAME)));
+				if (*ns_name == '\0')
+				{
+					if (with_default_ns)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("only one default namespace is allowed")));
+
+					with_default_ns = true;
+					ns_name = DEFAULT_NAMESPACE_NAME;
+				}
+
 				ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
 				if (xmlXPathRegisterNs(xpathctx,
 									   (xmlChar *) ns_name,
@@ -3951,6 +3973,16 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 			}
 		}
 
+		if (with_default_ns)
+		{
+			StringInfoData		str;
+
+			transformXPath(&str, text_to_cstring(xpath_expr_text), DEFAULT_NAMESPACE_NAME);
+			xpath_expr = pg_xmlCharStrndup(str.data, str.len);
+		}
+		else
+			xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
+
 		xpathcomp = xmlXPathCompile(xpath_expr);
 		if (xpathcomp == NULL || xmlerrcxt->err_occurred)
 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
@@ -4195,6 +4227,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4287,6 +4320,7 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif							/* not USE_LIBXML */
 }
 
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4297,12 +4331,25 @@ XmlTableSetNamespace(TableFuncScanState *state, char *name, char *uri)
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name != NULL)
+	{
+		/* Don't allow a namespace name equal to our internal default namespace name */
+		if (strcmp(name, DEFAULT_NAMESPACE_NAME) == 0)
+			ereport(ERROR,
+						(errcode(ERRCODE_RESERVED_NAME),
+						 errmsg("cannot to use \"%s\" as namespace name",
+								  DEFAULT_NAMESPACE_NAME),
+						 errdetail("\"%s\" is reserved for internal purpose",
+								  DEFAULT_NAMESPACE_NAME)));
+	}
+	else
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4331,6 +4378,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4362,6 +4417,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..ed5a071a0a
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,328 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML functionality is based on the libxml2 library, and
+ * XPath support is no exception.  However, libxml2 doesn't support a
+ * default namespace for XPath expressions.  Because there is no API to
+ * transform or access a parsed XPath expression, we have to parse
+ * XPath here.
+ *
+ * This is implemented with a simple XPath parser/preprocessor.  It
+ * transforms an XPath expression into another XPath expression that
+ * libxml2 can evaluate.  It doesn't replace the libxml2 XPath parser or
+ * libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens.  When a token is a node
+ * name, we can prepend a prefix to it.  When a default namespace is defined,
+ * we qualify every node name that has no namespace prefix with the default
+ * namespace prefix; unprefixed attribute names are left as they are.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_OTHER
+}	XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	char	   *start;
+	int			length;
+}	XPathTokenInfo;
+
+#define TOKEN_STACK_SIZE		10
+
+typedef struct ParserData
+{
+	char	   *str;
+	char	   *cur;
+	XPathTokenInfo stack[TOKEN_STACK_SIZE];
+	int			stack_length;
+}	XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define NODENAME_FIRSTCHAR(c)	 ((c) == '_' || (c) == '-' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(NODENAME_FIRSTCHAR(c) || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static char *
+getXPathToken(char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->stack_length = 0;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (parser->stack_length > 0)
+		memcpy(ti, &parser->stack[--parser->stack_length],
+			   sizeof(XPathTokenInfo));
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (parser->stack_length == TOKEN_STACK_SIZE)
+		elog(ERROR, "internal error");
+	memcpy(&parser->stack[parser->stack_length++], ti,
+		   sizeof(XPathTokenInfo));
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+}
+
+/*
+ * This is the main part of the XPath transformation.  It is called
+ * recursively when the XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		last_token_is_name = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+				last_token_is_name = false;
+				writeXPathToken(str, &t1);
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					bool		is_qual_name = false;
+
+					/* inside predicate ignore keywords "and" "or" */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "or", 2) == 0 && t1.length == 2))
+						{
+							writeXPathToken(str, &t1);
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					last_token_is_name = true;
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						if (*t2.start == '(')
+							last_token_is_name = false;
+						else if (*t2.start == ':')
+							is_qual_name = true;
+					}
+
+					if (last_token_is_name && !is_qual_name && def_namespace_name != NULL)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					writeXPathToken(str, &t1);
+
+					if (is_qual_name)
+					{
+						writeXPathToken(str, &t2);
+						nextXPathToken(parser, &t1);
+						if (t1.ttype == XPATH_TOKEN_NAME)
+							writeXPathToken(str, &t1);
+						else
+							pushXPathToken(parser, &t1);
+					}
+					else
+						pushXPathToken(parser, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						last_token_is_name = false;
+
+						if (c == ']' && inside_predicate)
+							return;
+
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+							{
+								bool		is_qual_name = false;
+
+								/*
+								 * A default namespace declaration applies to all
+								 * unprefixed element names within its scope. Default
+								 * namespace declarations do not apply directly to
+								 * attribute names; the interpretation of unprefixed
+								 * attributes is determined by the element on which
+								 * they appear.
+								 */
+								nextXPathToken(parser, &t2);
+								if (t2.ttype == XPATH_TOKEN_OTHER && *t2.start == ':')
+									is_qual_name = true;
+
+								writeXPathToken(str, &t1);
+								if (is_qual_name)
+								{
+									writeXPathToken(str, &t2);
+									nextXPathToken(parser, &t1);
+									if (t1.ttype == XPATH_TOKEN_NAME)
+										writeXPathToken(str, &t1);
+									else
+										pushXPathToken(parser, &t1);
+								}
+								else
+									pushXPathToken(parser, &t2);
+							}
+							else
+								pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+}
+
+#endif
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index bcc585d427..828d9b8352 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1085,7 +1085,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1452,3 +1456,50 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index d3bd8c91d7..58f9151788 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1302,3 +1302,59 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
 ---
 (0 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+ERROR:  unsupported XML feature
+LINE 1: INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a ...
+                                  ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="ht...
+                                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x...
+                                              ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: ...ELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xml...
+                                                             ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index ff77132803..0727a9df0a 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1065,7 +1065,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1432,3 +1436,50 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index eb4687fb09..8381be34b1 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -558,3 +558,22 @@ INSERT INTO xmltest2 VALUES('<d><r><dc>2</dc></r></d>', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
#5Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Kyotaro HORIGUCHI (#2)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi, thanks for the new patch.

# The patch is missing xpath_parser.h. That of the first patch was usable.

At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRBMQa07a=+qQAVMtz5M_hqkJBhiQSOP76+-BrFDj37pvg@mail.gmail.com>

Hi

Now xpath and xpath_exists support the default namespace too.

At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRCZ8oneG7g2vxs9ux71n8A9twwUO7zQpJiuz+7RGSpSuw@mail.gmail.com>

1. Uniformity among similar features

As mentioned in the proposal, it is a lack of uniformity that
the xpath transformer is applied only to xmltable and not to
other xpath-related functions.

I have to fix the XPath function. The SQL/XML function Xmlexists doesn't
support namespaces.

Sorry, I forgot to care about that. (And the definition of
namespace array is of course fabricated by me). I'd like to leave
this to committers. Anyway it is working but the syntax (or
whether it is acceptable) is still arguable.

SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['', 'http://example.com']]);
| xpath
| --------
| {test}
| (1 row)

The internal name is properly rejected, but the current internal
name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
are preserving some short names and reject them as
user-defined. Doesn't just 'pgsqlxml' work?

Default namespace correctly become to be applied on bare
attribute names.

updated doc,
fixed all variants of expected result test file

Sorry for one by one comment but I found another misbehavior.

create table t1 (id int, doc xml);
insert into t1
values
(5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' passing t1.doc columns data int PATH 'child::x:a[1][attribute::hoge="haha"]') as x;
| data
| ------
| 50

but the following fails.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' passing t1.doc columns data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
| data
| ------
|
| (1 row)

Perhaps child::a is not prefixed by the transformation.

XPath might be complex enough that it's worth switching to
a yacc/lex based transformer that is formally verifiable and won't
need a bunch of cryptic tests that ultimately cannot prove
completeness. synchronous_standby_names is far simpler than XPath
but uses a yacc/lex parser.

Anyway the following is nitpicking of the current xpath_parser.c.

- NODENAME_FIRSTCHAR allows '-' as the first char, but it is
excluded from NameStartChar (https://www.w3.org/TR/REC-xml/#NT-NameStartChar).
I think characters with the high bit set are okay.
Also IS_NODENAME_CHAR should be changed.

- NODENAME_FIRSTCHAR and IS_NODENAME_CHAR are in the same category
but have different naming schemes. Can these be named in the same way?

- The current transformer seems to use a token stack of depth at most
one. Maybe the stack is needless. (A pushed token is always
popped right after.)
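
The first point can be sketched in plain C as follows. This is only an illustration under stated assumptions: the helper names nodename_firstchar and nodename_char are hypothetical (the patch uses macros), and only ASCII is classified precisely, with every high-bit-set byte accepted as a possible part of a multibyte character. The ':' allowed by XML's NameStartChar is left out on purpose, since in an XPath name test it separates the prefix from the local name.

```c
#include <assert.h>   /* only needed for usage checks */
#include <stdbool.h>

/* A byte with the high bit set may belong to a multibyte character. */
static bool is_highbit_set(unsigned char c)
{
    return (c & 0x80) != 0;
}

/* First character of a node name: letters, '_' or a high-bit byte.
 * '-', '.' and digits are NOT allowed here. */
bool nodename_firstchar(unsigned char c)
{
    return c == '_' ||
           (c >= 'A' && c <= 'Z') ||
           (c >= 'a' && c <= 'z') ||
           is_highbit_set(c);
}

/* Any later character: everything above plus '-', '.' and digits. */
bool nodename_char(unsigned char c)
{
    return nodename_firstchar(c) ||
           c == '-' || c == '.' ||
           (c >= '0' && c <= '9');
}
```

With this split, '-' is rejected at the start of a name but accepted inside one, which is the behavior the W3C production describes for the ASCII range.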

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Pavel Stehule
pavel.stehule@gmail.com
In reply to: Kyotaro HORIGUCHI (#5)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp>:

Hi, thanks for the new patch.

# The patch is missing xpath_parser.h. That of the first patch was usable.

At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRBMQa07a=+qQAVMtz5M_hqkJBhiQSOP76+-BrFDj37pvg@
mail.gmail.com>

Hi

Now xpath and xpath_exists support the default namespace too.

At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRCZ8oneG7g2vxs9ux71n8A9twwUO7zQpJiuz+7RGSpSuw@mail.
gmail.com>

1. Uniformity among similar features

As mentioned in the proposal, it is a lack of uniformity that
the xpath transformer is applied only to xmltable and not to
other xpath-related functions.

I have to fix the XPath function. The SQL/XML function Xmlexists doesn't
support namespaces.

Sorry, I forgot to care about that. (And the definition of
namespace array is of course fabricated by me). I'd like to leave
this to committers. Anyway it is working but the syntax (or
whether it is acceptable) is still arguable.

SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['', 'http://example.com']]);
| xpath
| --------
| {test}
| (1 row)

The internal name is properly rejected, but the current internal
name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
are preserving some short names and reject them as
user-defined. Doesn't just 'pgsqlxml' work?

LibXML2 trims names to 100 bytes, so pgdefnamespace.pgsqlxml.internal
is safe from this perspective.

I would like to decrease the risk of a possible collision, so a longer string is
better. Maybe "pgsqlxml.internal" is good enough - I have no idea. But if
this string is ever printed somewhere, then
"pgdefnamespace.pgsqlxml.internal" has clear semantics, and that is the reason
why I prefer this string. PostgreSQL uses names of up to 63 bytes - and this string
fits within that limit too.
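
The length argument can be checked with a few lines of C. This is a rough sketch, not code from the patch: DEFAULT_NAMESPACE_NAME is the constant the patch defines, while PG_NAME_LIMIT, LIBXML_NAME_LIMIT and both helper functions are hypothetical names, and the two limits are simply the numbers quoted in this message (the libxml2 figure is not verified here).

```c
#include <assert.h>   /* only needed for usage checks */
#include <stdbool.h>
#include <string.h>

#define DEFAULT_NAMESPACE_NAME "pgdefnamespace.pgsqlxml.internal"
#define PG_NAME_LIMIT      63   /* PostgreSQL name length, per this message */
#define LIBXML_NAME_LIMIT 100   /* libxml2 limit as claimed in this message */

/* True when a user-supplied namespace name collides with the reserved one. */
bool is_reserved_ns_name(const char *name)
{
    return strcmp(name, DEFAULT_NAMESPACE_NAME) == 0;
}

/* True when the name fits within both limits discussed above. */
bool fits_name_limits(const char *name)
{
    size_t n = strlen(name);

    return n <= PG_NAME_LIMIT && n <= LIBXML_NAME_LIMIT;
}
```

The reserved name is 32 bytes long, so it is well inside both limits, and a shorter alternative would gain nothing on that front.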

Default namespace correctly become to be applied on bare
attribute names.

updated doc,
fixed all variants of expected result test file

Sorry for one by one comment but I found another misbehavior.

create table t1 (id int, doc xml);
insert into t1
values
(5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
'/x:rows/x:row' passing t1.doc columns data int PATH
'child::x:a[1][attribute::hoge="haha"]') as x;
| data
| ------
| 50

but the following fails.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' passing t1.doc columns data int PATH
'child::a[1][attribute::hoge="haha"]') as x;
| data
| ------
|
| (1 row)

Perhaps child::a is not prefixed by the transformation.

XPath might be complex enough that it's worth switching to
a yacc/lex based transformer that is formally verifiable and won't
need a bunch of cryptic tests that ultimately cannot prove
completeness. synchronous_standby_names is far simpler than XPath
but uses a yacc/lex parser.

I don't think so (not yet) - it is a simple state machine now, and once the code
is stable, it will not be modified.

Thank you for the comments, I'll look into it.

Regards

Pavel

Anyway the following is nitpicking of the current xpath_parser.c.

- NODENAME_FIRSTCHAR allows '-' as the first char, but it is
excluded from NameStartChar (https://www.w3.org/TR/REC-xml/#NT-NameStartChar).
I think characters with the high bit set are okay.
Also IS_NODENAME_CHAR should be changed.

- NODENAME_FIRSTCHAR and IS_NODENAME_CHAR are in the same category
but have different naming schemes. Can these be named in the same way?

- The current transformer seems to use a token stack of depth at most
one. Maybe the stack is needless. (A pushed token is always
popped right after.)

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#7Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Pavel Stehule (#6)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

At Mon, 2 Oct 2017 12:43:19 +0200, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRCD8=AzbRJQ2_2rp3+uzade9GHbm0DAF3-t__yVtzo2cA@mail.gmail.com>

Sorry, I forgot to care about that. (And the definition of
namespace array is of course fabricated by me). I'd like to leave
this to committers. Anyway it is working but the syntax (or
whether it is acceptable) is still arguable.

SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['', 'http://example.com']]);
| xpath
| --------
| {test}
| (1 row)

The internal name is properly rejected, but the current internal
name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
are preserving some short names and reject them as
user-defined. Doesn't just 'pgsqlxml' work?

LibXML2 trims names to 100 bytes, so pgdefnamespace.pgsqlxml.internal
is safe from this perspective.

I would like to decrease the risk of a possible collision, so a longer string is
better. Maybe "pgsqlxml.internal" is good enough - I have no idea. But if
this string is ever printed somewhere, then
"pgdefnamespace.pgsqlxml.internal" has clear semantics, and that is the reason
why I prefer this string. PostgreSQL uses names of up to 63 bytes - and this string
fits within that limit too.

Ok, I'm fine with that.

Default namespace correctly become to be applied on bare
attribute names.

updated doc,
fixed all variants of expected result test file

Sorry for one by one comment but I found another misbehavior.

create table t1 (id int, doc xml);
insert into t1
values
(5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
'/x:rows/x:row' passing t1.doc columns data int PATH
'child::x:a[1][attribute::hoge="haha"]') as x;
| data
| ------
| 50

but the following fails.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' passing t1.doc columns data int PATH
'child::a[1][attribute::hoge="haha"]') as x;
| data
| ------
|
| (1 row)

Perhaps child::a is not prefixed by the transformation.

XPath might be complex enough that it's worth switching to
a yacc/lex based transformer that is formally verifiable and won't
need a bunch of cryptic tests that ultimately cannot prove
completeness. synchronous_standby_names is far simpler than XPath
but uses a yacc/lex parser.

I don't think so (not yet) - it is a simple state machine now, and once the code
is stable, it will not be modified.

Hmm. Ok, agreed. I didn't mean the current shape ought to be
changed.

Thank you for the comments, I'll look into it.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#8Pavel Stehule
pavel.stehule@gmail.com
In reply to: Kyotaro HORIGUCHI (#5)
1 attachment(s)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp>:

Hi, thanks for the new patch.

# The patch is missing xpath_parser.h. That of the first patch was usable.

At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRBMQa07a=+qQAVMtz5M_hqkJBhiQSOP76+-BrFDj37pvg@
mail.gmail.com>

Hi

Now xpath and xpath_exists support the default namespace too.

At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRCZ8oneG7g2vxs9ux71n8A9twwUO7zQpJiuz+7RGSpSuw@mail.
gmail.com>

1. Uniformity among similar features

As mentioned in the proposal, it is a lack of uniformity that
the xpath transformer is applied only to xmltable and not to
other xpath-related functions.

I have to fix the XPath function. The SQL/XML function Xmlexists doesn't
support namespaces.

Sorry, I forgot to care about that. (And the definition of
namespace array is of course fabricated by me). I'd like to leave
this to committers. Anyway it is working but the syntax (or
whether it is acceptable) is still arguable.

SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['', 'http://example.com']]);
| xpath
| --------
| {test}
| (1 row)

The internal name is properly rejected, but the current internal
name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
are preserving some short names and reject them as
user-defined. Doesn't just 'pgsqlxml' work?

Default namespace correctly become to be applied on bare
attribute names.

updated doc,
fixed all variants of expected result test file

Sorry for one by one comment but I found another misbehavior.

create table t1 (id int, doc xml);
insert into t1
values
(5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
'/x:rows/x:row' passing t1.doc columns data int PATH
'child::x:a[1][attribute::hoge="haha"]') as x;
| data
| ------
| 50

but the following fails.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' passing t1.doc columns data int PATH
'child::a[1][attribute::hoge="haha"]') as x;
| data
| ------
|
| (1 row)

Perhaps child::a is not prefixed by the transformation.

The problem was an unwanted attribute modification. The parser didn't
detect "attribute::hoge" as an attribute. The updated parser does. I also
reduced the duplicated code there.
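
To make the rule concrete, here is a minimal standalone sketch of the prefixing idea in C. It is not the code from the attached patch: add_default_prefix, the Out type and the caller-supplied fixed-size buffer are assumptions made for illustration. It prefixes bare element names and leaves alone attribute names (after '@' or attribute::), axis names, function calls, already qualified names, string literals, and "and"/"or" inside predicates. Other XPath details, such as the div and mod operators, are ignored.

```c
#include <assert.h>   /* only needed for usage checks */
#include <stddef.h>
#include <string.h>

typedef struct { char *buf; size_t len, cap; } Out;

/* Append n bytes, keeping the buffer NUL-terminated; -1 on overflow. */
static int put(Out *o, const char *s, size_t n)
{
    if (o->len + n + 1 > o->cap)
        return -1;
    memcpy(o->buf + o->len, s, n);
    o->len += n;
    o->buf[o->len] = '\0';
    return 0;
}

static int name_start(unsigned char c)
{
    return c == '_' || (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') || c >= 0x80;
}

static int name_char(unsigned char c)
{
    return name_start(c) || c == '-' || c == '.' || (c >= '0' && c <= '9');
}

static int is_word(const char *p, size_t n, const char *w)
{
    return strlen(w) == n && strncmp(p, w, n) == 0;
}

/* Rewrite xpath into out, adding "prefix:" to bare element names. */
int add_default_prefix(const char *in, const char *prefix, char *out, size_t cap)
{
    Out o = { out, 0, cap };
    int depth = 0;              /* predicate nesting level */
    int attr = 0;               /* next name is an attribute name */
    const char *p = in;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    while (*p)
    {
        unsigned char c = (unsigned char) *p;

        if (c == '"' || c == '\'')
        {
            /* string literal: copy verbatim up to the closing quote */
            const char *q = strchr(p + 1, c);
            size_t n = q ? (size_t) (q - p) + 1 : strlen(p);

            if (put(&o, p, n)) return -1;
            p += n;
            attr = 0;
        }
        else if (name_start(c))
        {
            const char *q = p + 1, *la;
            size_t n;
            int bare;

            while (name_char((unsigned char) *q)) q++;
            n = (size_t) (q - p);
            for (la = q; *la == ' '; la++) ;    /* look past spaces */

            if (la[0] == ':' && la[1] == ':')
            {
                /* axis name; "attribute::" marks the next name as attribute */
                attr = is_word(p, n, "attribute");
                if (put(&o, p, (size_t) (la + 2 - p))) return -1;
                p = la + 2;
                continue;
            }
            if (la[0] == ':')
            {
                /* already qualified: copy "prefix:local" untouched */
                q = la + 1;
                while (name_char((unsigned char) *q)) q++;
                if (put(&o, p, (size_t) (q - p))) return -1;
                p = q;
                attr = 0;
                continue;
            }
            bare = !attr && la[0] != '(' &&
                !(depth > 0 && (is_word(p, n, "and") || is_word(p, n, "or")));
            if (bare && (put(&o, prefix, strlen(prefix)) || put(&o, ":", 1)))
                return -1;
            if (put(&o, p, n)) return -1;
            p = q;
            attr = 0;
        }
        else
        {
            if (c == '[') depth++;
            else if (c == ']' && depth > 0) depth--;
            if (c == '@') attr = 1;
            else if (c != ' ') attr = 0;
            if (put(&o, p, 1)) return -1;
            p++;
        }
    }
    return 0;
}
```

Under these assumptions, the failing path from the report, child::a[1][attribute::hoge="haha"], comes out as child::d:a[1][attribute::hoge="haha"] for a prefix of "d": the element gets the prefix and the attribute does not.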

XPath might be complex enough that it's worth switching to
a yacc/lex based transformer that is formally verifiable and won't
need a bunch of cryptic tests that ultimately cannot prove
completeness. synchronous_standby_names is far simpler than XPath
but uses a yacc/lex parser.

Anyway the following is nitpicking of the current xpath_parser.c.

- NODENAME_FIRSTCHAR allows '-' as the first char, but it is
excluded from NameStartChar (https://www.w3.org/TR/REC-xml/#NT-NameStartChar).
I think characters with the high bit set are okay.
Also IS_NODENAME_CHAR should be changed.

fixed

- NODENAME_FIRSTCHAR and IS_NODENAME_CHAR are in the same category
but have different naming schemes. Can these be named in the same way?

fixed

- The current transformer seems to use a token stack of depth at most
one. Maybe the stack is needless. (A pushed token is always
popped right after.)

fixed

Regards

Pavel

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

xml-xpath-default-ns-4.patchtext/x-patch; charset=US-ASCII; name=xml-xpath-default-ns-4.patchDownload
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b52407822d..af72a07326 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10468,7 +10468,8 @@ SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuf
      second the namespace URI. It is not required that aliases provided in
      this array be the same as those being used in the XML document itself (in
      other words, both in the XML document and in the <function>xpath</function>
-     function context, aliases are <emphasis>local</>).
+     function context, aliases are <emphasis>local</>). Default namespace has
+     empty name (empty string) and should be only one.
     </para>
 
     <para>
@@ -10487,8 +10488,8 @@ SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
     <para>
      To deal with default (anonymous) namespaces, do something like this:
 <screen><![CDATA[
-SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
-             ARRAY[ARRAY['mydefns', 'http://example.com']]);
+SELECT xpath('//b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
+             ARRAY[ARRAY['', 'http://example.com']]);
 
  xpath
 --------
@@ -10562,8 +10563,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</m
     <para>
      The optional <literal>XMLNAMESPACES</> clause is a comma-separated
      list of namespaces.  It specifies the XML namespaces used in
-     the document and their aliases. A default namespace specification
-     is not currently supported.
+     the document and their aliases.
     </para>
 
     <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 1fb018416e..b60a3cfe0d 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 24229c2dff..90c51ea1c6 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -90,7 +90,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -187,6 +187,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -227,6 +228,7 @@ const TableFuncRoutine XmlTableRoutine =
 #define NAMESPACE_XSI "http://www.w3.org/2001/XMLSchema-instance"
 #define NAMESPACE_SQLXML "http://standards.iso.org/iso/9075/2003/sqlxml"
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace.pgsqlxml.internal"
 
 #ifdef USE_LIBXML
 
@@ -3849,6 +3851,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	int			ndim;
 	Datum	   *ns_names_uris;
 	bool	   *ns_names_uris_nulls;
+	bool		with_default_ns = false;
 	int			ns_count;
 
 	/*
@@ -3898,7 +3901,6 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 				 errmsg("empty XPath expression")));
 
 	string = pg_xmlCharStrndup(datastr, len);
-	xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -3941,6 +3943,26 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 							(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 							 errmsg("neither namespace name nor URI may be null")));
 				ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+
+				/* Don't allow same namespace as out internal default namespace name */
+				if (strcmp(ns_name, DEFAULT_NAMESPACE_NAME) == 0)
+					ereport(ERROR,
+								(errcode(ERRCODE_RESERVED_NAME),
+								 errmsg("cannot to use \"%s\" as namespace name",
+										  DEFAULT_NAMESPACE_NAME),
+								 errdetail("\"%s\" is reserved for internal purpose",
+										  DEFAULT_NAMESPACE_NAME)));
+				if (*ns_name == '\0')
+				{
+					if (with_default_ns)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("only one default namespace is allowed")));
+
+					with_default_ns = true;
+					ns_name = DEFAULT_NAMESPACE_NAME;
+				}
+
 				ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
 				if (xmlXPathRegisterNs(xpathctx,
 									   (xmlChar *) ns_name,
@@ -3951,6 +3973,16 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 			}
 		}
 
+		if (with_default_ns)
+		{
+			StringInfoData		str;
+
+			transformXPath(&str, text_to_cstring(xpath_expr_text), DEFAULT_NAMESPACE_NAME);
+			xpath_expr = pg_xmlCharStrndup(str.data, str.len);
+		}
+		else
+			xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
+
 		xpathcomp = xmlXPathCompile(xpath_expr);
 		if (xpathcomp == NULL || xmlerrcxt->err_occurred)
 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
@@ -4195,6 +4227,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4287,6 +4320,7 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif							/* not USE_LIBXML */
 }
 
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4297,12 +4331,25 @@ XmlTableSetNamespace(TableFuncScanState *state, char *name, char *uri)
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name != NULL)
+	{
+		/* Don't allow same namespace as out internal default namespace name */
+		if (strcmp(name, DEFAULT_NAMESPACE_NAME) == 0)
+			ereport(ERROR,
+						(errcode(ERRCODE_RESERVED_NAME),
+						 errmsg("cannot to use \"%s\" as namespace name",
+								  DEFAULT_NAMESPACE_NAME),
+						 errdetail("\"%s\" is reserved for internal purpose",
+								  DEFAULT_NAMESPACE_NAME)));
+	}
+	else
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4331,6 +4378,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4362,6 +4417,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..f22ec7638b
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,327 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML related functionality is based on libxml2 library, and
+ * XPath support is not an exception.  However, libxml2 doesn't support
+ * default namespace for XPath expressions. Because there are not any API
+ * how to transform or access to parsed XPath expression we have to parse
+ * XPath here.
+ *
+ * Those functionalities are implemented with a simple XPath parser/
+ * preprocessor.  This XPath parser transforms a XPath expression to another
+ * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
+ * replace libxml2 XPath parser or libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens.  When expression starting with
+ * nodename, then we can use prefix.  When default namespace is defined, then we
+ * should to enhance any nodename and attribute without namespace by default
+ * namespace.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_OTHER
+}	XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	char	   *start;
+	int			length;
+}	XPathTokenInfo;
+
+typedef struct ParserData
+{
+	char	   *str;
+	char	   *cur;
+	XPathTokenInfo buffer;
+	bool		buffer_is_empty;
+}	XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define IS_NODENAME_FIRSTCHAR(c)	 ((c) == '_' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(IS_NODENAME_FIRSTCHAR(c) || (c) == '-' || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static char *
+getXPathToken(char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (IS_NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->buffer_is_empty = true;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+	{
+		memcpy(ti, &parser->buffer, sizeof(XPathTokenInfo));
+		parser->buffer_is_empty = true;
+	}
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+		elog(ERROR, "internal error");
+
+	memcpy(&parser->buffer, ti, sizeof(XPathTokenInfo));
+	parser->buffer_is_empty = false;
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+}
+
+/*
+ * This is main part of XPath transformation. It can be called recursivly,
+ * when XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		token_is_tagname = false;
+	bool		token_is_tagattrib = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+				token_is_tagname = false;
+				token_is_tagattrib = false;
+
+				writeXPathToken(str, &t1);
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					bool		is_qual_name = false;
+
+					/* inside predicate ignore keywords "and" "or" */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "or", 2) == 0 && t1.length == 2))
+						{
+							writeXPathToken(str, &t1);
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					token_is_tagname = true;
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						if (*t2.start == '(')
+							token_is_tagname = false;
+						else if (*t2.start == ':')
+						{
+							XPathTokenInfo t3;
+
+							nextXPathToken(parser, &t3);
+							if (t3.ttype == XPATH_TOKEN_OTHER && *t3.start == ':'
+									 && strncmp(t1.start, "attribute", 9) == 0)
+							{
+								/* other syntax for attribute, where we should not apply def namespace */
+								appendStringInfo(str, "attribute::");
+								nextXPathToken(parser, &t1);
+								token_is_tagattrib = true;
+								break;
+							}
+							else
+							{
+								pushXPathToken(parser, &t3);
+								is_qual_name = true;
+							}
+						}
+					}
+
+					if (token_is_tagname && !token_is_tagattrib
+								 && !is_qual_name && def_namespace_name != NULL)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					token_is_tagattrib = false;
+
+					writeXPathToken(str, &t1);
+
+					if (is_qual_name)
+					{
+						writeXPathToken(str, &t2);
+						nextXPathToken(parser, &t1);
+						if (t1.ttype == XPATH_TOKEN_NAME)
+							writeXPathToken(str, &t1);
+						else
+							pushXPathToken(parser, &t1);
+					}
+					else
+						pushXPathToken(parser, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					token_is_tagattrib = false;
+					token_is_tagname = false;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						if (c == ']' && inside_predicate)
+							return;
+
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+								token_is_tagattrib = true;
+
+							pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+}
+
+#endif
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index bcc585d427..63e04f1353 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1085,7 +1085,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1452,3 +1456,56 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index d3bd8c91d7..58f9151788 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1302,3 +1302,59 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
 ---
 (0 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+ERROR:  unsupported XML feature
+LINE 1: INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a ...
+                                  ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="ht...
+                                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x...
+                                              ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: ...ELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xml...
+                                                             ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index ff77132803..c92a09e5a9 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1065,7 +1065,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1432,3 +1436,56 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index eb4687fb09..e8cff5f22d 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -558,3 +558,23 @@ INSERT INTO xmltest2 VALUES('<d><r><dc>2</dc></r></d>', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
#9Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Pavel Stehule (#8)
1 attachment(s)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Thank you for the new patch.

- The latest patch is missing xpath_parser.h at least since
ns-3. That of the first (not-numbered) version was still
usable.

- c29c578 conflicts on doc/src/sgml/func.sgml

At Sun, 15 Oct 2017 12:06:11 +0200, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRCYBH+a6oJoEYUFDUpBQ1ySwtt2CfnFZxs2Ab9EfONbUQ@mail.gmail.com>

2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp>:

Hi, thanks for the new patch.

# The patch is missing xpath_parser.h. That of the first patch was usable.

At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRBMQa07a=+qQAVMtz5M_hqkJBhiQSOP76+-BrFDj37pvg@mail.gmail.com>

Hi

Now xpath and xpath_exists support the default namespace too.

At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRCZ8oneG7g2vxs9ux71n8A9twwUO7zQpJiuz+7RGSpSuw@mail.gmail.com>

1. Uniformity among similar features

As mentioned in the proposal, it is a lack of uniformity that
the XPath transformer is applied only to xmltable and not to the
other XPath-related functions.

I have to fix the XPath function. The SQL/XML function Xmlexists doesn't
support namespaces.

Sorry, I forgot to care about that. (And the definition of
namespace array is of course fabricated by me). I'd like to leave
this to committers. Anyway it is working but the syntax (or
whether it is acceptable) is still arguable.

SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['', 'http://example.com']]);
| xpath
| --------
| {test}
| (1 row)

The internal name is properly rejected, but the current internal
name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
already reserve some short names and reject them when they are
user-defined. Wouldn't just 'pgsqlxml' work?

The default namespace is now handled correctly for bare
attribute names.

updated doc,
fixed all variants of expected result test file

Sorry for one by one comment but I found another misbehavior.

create table t1 (id int, doc xml);
insert into t1
values
(5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
'/x:rows/x:row' passing t1.doc columns data int PATH
'child::x:a[1][attribute::hoge="haha"]') as x;
| data
| ------
| 50

but the following fails.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' passing t1.doc columns data int PATH
'child::a[1][attribute::hoge="haha"]') as x;
| data
| ------
|
| (1 row)

Perhaps child::a is not prefixed by the transformation.

The problem was an unwanted modification of the attribute: the parser didn't
detect "attribute::hoge" as an attribute. The updated parser does. I also
reduced the duplicated code there.

It worked as expected. But the comparison with "attribute" is
missing a t1.length == 9 check, so the following expression wrongly passes.

child::a[1][attributeabcdefg::hoge="haha"]
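
To illustrate the missing length check, here is a minimal sketch. It assumes a hypothetical helper, token_equals(), which is not part of the patch. A prefix-only strncmp() accepts any name that merely starts with "attribute"; checking the token length first rejects such names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of the patch): returns true only when the
 * token of 'length' bytes starting at 'start' is exactly the keyword 'kw'.
 */
static bool
token_equals(const char *start, size_t length, const char *kw)
{
	size_t		kwlen = strlen(kw);

	assert(start != NULL);

	/* compare lengths first, so a longer name sharing the prefix fails */
	return length == kwlen && strncmp(start, kw, kwlen) == 0;
}
```

With such a helper the axis test would read token_equals(t1.start, t1.length, "attribute"), and "attributeabcdefg" would no longer be mistaken for the attribute axis.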

It is confusing that is_qual_name becomes true when t2 is not a
"qual name", and the way it treats a double-colon is hard to
understand.

Essentially, it inserts the default namespace before each
unqualified non-attribute name. I believe we can easily
look ahead to detect a double colon, and that would make things
simpler. Could you consider something like the attached patch?
(It applies on top of the ns-4 patch.)

XPath might be complex enough that it is worth switching to a
yacc/lex-based transformer, which is formally verifiable and would not
need a bunch of cryptic tests that in the end cannot prove
completeness. synchronous_standby_names is far simpler than XPath,
yet it uses a yacc/lex parser.

Anyway the following is nitpicking of the current xpath_parser.c.

- NODENAME_FIRSTCHAR allows '-' as the first char, but it is
excluded from NameStartChar
(https://www.w3.org/TR/REC-xml/#NT-NameStartChar).
I think characters with the high bit set are okay.
IS_NODENAME_CHAR should be changed as well.

fixed

- NODENAME_FIRSTCHAR and IS_NODENAME_CHAR are in the same category
but have different naming schemes. Can these be named in the same way?

fixed

- The current transformer seems to use a token stack of depth at
most one. Maybe the stack is unnecessary. (A pushed token is always
popped immediately afterwards.)

fixed
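
The first nitpick above (about NameStartChar) could be sketched roughly as follows. This is an illustration only: the function names are hypothetical and not taken from the patch, the check is byte-wise, and every byte with the high bit set is accepted without validating the multibyte character. The colon is left out on purpose, on the assumption that the lexer handles it separately as the prefix separator.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical byte-wise checks; the names are not taken from the patch. */

/* First byte of a name: ASCII letter, underscore, or any high-bit byte. */
static bool
is_name_start_byte(int c)
{
	/* pass an unsigned char value, as with <ctype.h> */
	assert(c >= 0 && c <= 0xFF);

	return (c >= 'A' && c <= 'Z') ||
		(c >= 'a' && c <= 'z') ||
		c == '_' ||
		c >= 0x80;
}

/* Later bytes additionally allow '-', '.' and ASCII digits. */
static bool
is_name_byte(int c)
{
	return is_name_start_byte(c) ||
		c == '-' || c == '.' ||
		(c >= '0' && c <= '9');
}
```

Defining the second check in terms of the first keeps the two in the same naming scheme and makes it hard for them to drift apart.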

Thank you.

I found another (and it should be the last, so sorry..) functional
defect in this. It doesn't add the default namespace if the tag
name in a predicate is 'and' or 'or'. This needs to be fixed, or
written in the documentation as a restriction. (It seems hard to
fix..)

create table t1 (id int, doc xml);
insert into t1 values (1, '<rows xmlns="http://x.y"><row><val>50</val><and>60</and></row></rows>');
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
'/x:rows/x:row' passing t1.doc columns data int PATH
'x:val[../x:and = 60]') as x;
data
------
50
(1 row)
select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' passing t1.doc columns data int PATH
'val[../and = 60]') as x;
data
------

(1 row)
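
One possible direction for a fix is sketched below. It assumes the transformer follows the XPath 1.0 lexical disambiguation rule, under which a name is an operator name only when there is a preceding token and that token is not one of @, ::, (, [, a comma, or an operator. The enum and function names are hypothetical and not part of the patch; the transformer would have to remember the class of the previously seen token.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of the previously seen token. */
typedef enum
{
	PREV_NONE,					/* start of the expression */
	PREV_AT,					/* @ */
	PREV_DCOLON,				/* :: */
	PREV_LPAREN,				/* ( */
	PREV_LBRACKET,				/* [ */
	PREV_COMMA,					/* , */
	PREV_OPERATOR,				/* slash, |, +, -, =, <, >, and, or, mod, div */
	PREV_OPERAND				/* anything else: name, literal, number, ) ] */
} PrevTokenKind;

/* true if the token is spelled exactly like one of the operator names */
static bool
is_operator_word(const char *start, size_t length)
{
	static const char *const words[] = {"and", "or", "mod", "div"};
	size_t		i;

	assert(start != NULL);

	for (i = 0; i < sizeof(words) / sizeof(words[0]); i++)
	{
		if (length == strlen(words[i]) &&
			strncmp(start, words[i], length) == 0)
			return true;
	}
	return false;
}

/*
 * A name is an operator only if it follows an operand; after an operator,
 * '[', '(', '@', '::', ',' or at the start it is an ordinary name test.
 */
static bool
name_is_operator(PrevTokenKind prev, const char *start, size_t length)
{
	return prev == PREV_OPERAND && is_operator_word(start, length);
}
```

In 'val[../and = 60]' the token before "and" is a slash, so "and" would be treated as a tag name and receive the default namespace. In 'a[b and c]' the token before "and" is the name b, so it stays an operator.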

Other comments follow.

- Please add more comments. The XPATH_TOKEN_NAME case of _transformXPath
in my patch has more.

- Debug output might be needed.

# sorry now time's up. will continue tomorrow.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

xml-xpath-default-ns-4-kh.patch (text/x-patch; charset=us-ascii)
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
index f22ec76..1d8d93c 100644
--- a/src/backend/utils/adt/xpath_parser.c
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -41,6 +41,8 @@ typedef enum
 	XPATH_TOKEN_NAME,
 	XPATH_TOKEN_STRING,
 	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_COLON,
+	XPATH_TOKEN_DCOLON,
 	XPATH_TOKEN_OTHER
 }	XPathTokenType;
 
@@ -68,6 +70,7 @@ typedef struct ParserData
 #define IS_NODENAME_CHAR(c)		(IS_NODENAME_FIRSTCHAR(c) || (c) == '-' || (c) == '.' || \
 								 ((c) >= '0' && (c) <= '9'))
 
+#define TOKEN_IS_EMPTY(t)	((t)->ttype == XPATH_TOKEN_NONE)
 
 /*
  * Returns next char after last char of token - XPath lexer
@@ -112,6 +115,17 @@ getXPathToken(char *str, XPathTokenInfo * ti)
 
 			ti->ttype = XPATH_TOKEN_STRING;
 		}
+		else if (c == ':')
+		{
+			/* look ahead to detect a double-colon */
+			if (*str == ':')
+			{
+				ti->ttype = XPATH_TOKEN_DCOLON;
+				str++;
+			}
+			else
+				ti->ttype = XPATH_TOKEN_COLON;
+		}
 		else
 			ti->ttype = XPATH_TOKEN_OTHER;
 
@@ -165,6 +179,7 @@ pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
 
 	memcpy(&parser->buffer, ti, sizeof(XPathTokenInfo));
 	parser->buffer_is_empty = false;
+	ti->ttype = XPATH_TOKEN_NONE;
 }
 
 /*
@@ -179,6 +194,9 @@ writeXPathToken(StringInfo str, XPathTokenInfo * ti)
 		appendBinaryStringInfo(str, ti->start, ti->length);
 	else
 		appendStringInfoChar(str, *ti->start);
+
+	/* this token is consumed */
+	ti->ttype = XPATH_TOKEN_NONE;
 }
 
 /*
@@ -192,7 +210,7 @@ _transformXPath(StringInfo str, XPathParserData * parser,
 {
 	XPathTokenInfo t1,
 				t2;
-	bool		token_is_tagname = false;
+	bool		tagname_needs_defnsp = false;
 	bool		token_is_tagattrib = false;
 
 	nextXPathToken(parser, &t1);
@@ -203,7 +221,11 @@ _transformXPath(StringInfo str, XPathParserData * parser,
 		{
 			case XPATH_TOKEN_NUMBER:
 			case XPATH_TOKEN_STRING:
-				token_is_tagname = false;
+				/*
+				 * string cannot be a tag name. write out it immediately and
+				 * go ahead
+				 */
+				tagname_needs_defnsp = false;
 				token_is_tagattrib = false;
 
 				writeXPathToken(str, &t1);
@@ -212,8 +234,6 @@ _transformXPath(StringInfo str, XPathParserData * parser,
 
 			case XPATH_TOKEN_NAME:
 				{
-					bool		is_qual_name = false;
-
 					/* inside predicate ignore keywords "and" "or" */
 					if (inside_predicate)
 					{
@@ -226,53 +246,56 @@ _transformXPath(StringInfo str, XPathParserData * parser,
 						}
 					}
 
-					token_is_tagname = true;
+					/* look ahead what is following the name token */
+					tagname_needs_defnsp = true;
 					nextXPathToken(parser, &t2);
-					if (t2.ttype == XPATH_TOKEN_OTHER)
+					if (t2.ttype == XPATH_TOKEN_COLON)
+					{
+						/* t1 is a qualified node name. no need to add default one. */
+						tagname_needs_defnsp = false;
+						writeXPathToken(str, &t1);	/* namespace name */
+						writeXPathToken(str, &t2);	/* colon */
+						/* get node name */
+						nextXPathToken(parser, &t1);
+					}
+					else if (t2.ttype == XPATH_TOKEN_DCOLON)
+					{
+						/* t1 is an axis name. write out as it is */
+						if (strncmp(t1.start, "attribute", 9) == 0 && t1.length == 9)
+							token_is_tagattrib = true;
+
+						writeXPathToken(str, &t1);	/* axis name */
+						writeXPathToken(str, &t2);	/* double colon */
+
+						/*
+						 * The next token may be qualified tag name, process
+						 * it as a fresh token.
+						 */
+						nextXPathToken(parser, &t1);
+						break;
+					}
+					else if (t2.ttype == XPATH_TOKEN_OTHER)
 					{
+						/* function name doesn't require namespace */
 						if (*t2.start == '(')
-							token_is_tagname = false;
-						else if (*t2.start == ':')
-						{
-							XPathTokenInfo t3;
-
-							nextXPathToken(parser, &t3);
-							if (t3.ttype == XPATH_TOKEN_OTHER && *t3.start == ':'
-									 && strncmp(t1.start, "attribute", 9) == 0)
-							{
-								/* other syntax for attribute, where we should not apply def namespace */
-								appendStringInfo(str, "attribute::");
-								nextXPathToken(parser, &t1);
-								token_is_tagattrib = true;
-								break;
-							}
-							else
-							{
-								pushXPathToken(parser, &t3);
-								is_qual_name = true;
-							}
-						}
+							tagname_needs_defnsp = false;
+						else
+							pushXPathToken(parser, &t2);
 					}
 
-					if (token_is_tagname && !token_is_tagattrib
-								 && !is_qual_name && def_namespace_name != NULL)
+					if (tagname_needs_defnsp && !token_is_tagattrib &&
+						def_namespace_name != NULL)
 						appendStringInfo(str, "%s:", def_namespace_name);
 
 					token_is_tagattrib = false;
 
-					writeXPathToken(str, &t1);
+					/* write maybe-tagname if not consumed */
+					if (!TOKEN_IS_EMPTY(&t1))
+						writeXPathToken(str, &t1);
 
-					if (is_qual_name)
-					{
+					/* output t2 if not consumed yet */
+					if (!TOKEN_IS_EMPTY(&t2))
 						writeXPathToken(str, &t2);
-						nextXPathToken(parser, &t1);
-						if (t1.ttype == XPATH_TOKEN_NAME)
-							writeXPathToken(str, &t1);
-						else
-							pushXPathToken(parser, &t1);
-					}
-					else
-						pushXPathToken(parser, &t2);
 
 					nextXPathToken(parser, &t1);
 				}
@@ -283,7 +306,6 @@ _transformXPath(StringInfo str, XPathParserData * parser,
 					char		c = *t1.start;
 
 					token_is_tagattrib = false;
-					token_is_tagname = false;
 
 					writeXPathToken(str, &t1);
 
@@ -307,10 +329,14 @@ _transformXPath(StringInfo str, XPathParserData * parser,
 				}
 				break;
 
+			case XPATH_TOKEN_COLON:
+			case XPATH_TOKEN_DCOLON:
 			case XPATH_TOKEN_NONE:
 				elog(ERROR, "should not be here");
 		}
 	}
+
+	elog(LOG, "\"%s\"", str->data);
 }
 
 void
#10Pavel Stehule
pavel.stehule@gmail.com
In reply to: Kyotaro HORIGUCHI (#9)
1 attachment(s)
Re: proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi

2017-11-06 14:00 GMT+01:00 Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp>:

Thank you for the new patch.

- The latest patch is missing xpath_parser.h at least since
ns-3. That of the first (not-numbered) version was still
usable.

- c29c578 conflicts on doc/src/sgml/func.sgml

At Sun, 15 Oct 2017 12:06:11 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRCYBH+a6oJoEYUFDUpBQ1ySwtt2CfnFZxs2Ab9EfONbUQ@mail.gmail.com>

2017-10-02 12:22 GMT+02:00 Kyotaro HORIGUCHI <
horiguchi.kyotaro@lab.ntt.co.jp>:

Hi, thanks for the new patch.

# The patch is missing xpath_parser.h. That of the first patch was usable.

At Thu, 28 Sep 2017 07:59:41 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRBMQa07a=+qQAVMtz5M_hqkJBhiQSOP76+-BrFDj37pvg@mail.gmail.com>

Hi

Now xpath and xpath_exists support the default namespace too.

At Wed, 27 Sep 2017 22:41:52 +0200, Pavel Stehule <pavel.stehule@gmail.com>
wrote in <CAFj8pRCZ8oneG7g2vxs9ux71n8A9twwUO7zQpJiuz+7RGSpSuw@mail.gmail.com>

1. Uniformity among similar features

As mentioned in the proposal, it is a lack of uniformity that
the XPath transformer is applied only to xmltable and not to the
other XPath-related functions.

I have to fix the XPath function. The SQL/XML function Xmlexists doesn't
support namespaces.

Sorry, I forgot to care about that. (And the definition of
namespace array is of course fabricated by me). I'd like to leave
this to committers. Anyway it is working but the syntax (or
whether it is acceptable) is still arguable.

SELECT xpath('/a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
ARRAY[ARRAY['', 'http://example.com']]);
| xpath
| --------
| {test}
| (1 row)

The internal name is properly rejected, but the current internal
name (pgdefnamespace.pgsqlxml.internal) seems a bit too long. We
already reserve some short names and reject them when they are
user-defined. Wouldn't just 'pgsqlxml' work?

The default namespace is now handled correctly for bare
attribute names.

updated doc,
fixed all variants of expected result test file

Sorry for one by one comment but I found another misbehavior.

create table t1 (id int, doc xml);
insert into t1
values
(5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
'/x:rows/x:row' passing t1.doc columns data int PATH
'child::x:a[1][attribute::hoge="haha"]') as x;
| data
| ------
| 50

but the following fails.

select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' passing t1.doc columns data int PATH
'child::a[1][attribute::hoge="haha"]') as x;
| data
| ------
|
| (1 row)

Perhaps child::a is not prefixed by the transformation.

The problem was an unwanted modification of the attribute: the parser didn't
detect "attribute::hoge" as an attribute. The updated parser does. I also
reduced the duplicated code there.

It worked as expected. But the comparison with "attribute" is
missing a t1.length == 9 check, so the following expression wrongly passes.

child::a[1][attributeabcdefg::hoge="haha"]

It is confusing that is_qual_name becomes true when t2 is not a
"qual name", and the way it treats a double-colon is hard to
understand.

Essentially, it inserts the default namespace before each
unqualified non-attribute name. I believe we can easily
look ahead to detect a double colon, and that would make things
simpler. Could you consider something like the attached patch?
(It applies on top of the ns-4 patch.)

XPath might be complex enough that it is worth switching to a
yacc/lex-based transformer, which is formally verifiable and would not
need a bunch of cryptic tests that in the end cannot prove
completeness. synchronous_standby_names is far simpler than XPath,
yet it uses a yacc/lex parser.

Anyway the following is nitpicking of the current xpath_parser.c.

- NODENAME_FIRSTCHAR allows '-' as the first char, but it is
excluded from NameStartChar
(https://www.w3.org/TR/REC-xml/#NT-NameStartChar).
I think characters with the high bit set are okay.
IS_NODENAME_CHAR should be changed as well.

fixed

- NODENAME_FIRSTCHAR and IS_NODENAME_CHAR are in the same category
but have different naming schemes. Can these be named in the same way?

fixed

- The current transformer seems to use a token stack of depth at
most one. Maybe the stack is unnecessary. (A pushed token is always
popped immediately afterwards.)

fixed

Thank you.

I found another (and it should be the last, so sorry..) functional
defect in this. It doesn't add the default namespace if the tag
name in a predicate is 'and' or 'or'. This needs to be fixed, or
written in the documentation as a restriction. (It seems hard to
fix..)

create table t1 (id int, doc xml);
insert into t1 values (1, '<rows xmlns="http://x.y"><row><val>50</val><and>60</and></row></rows>');
select x.* from t1, xmltable(XMLNAMESPACES('http://x.y' AS x),
'/x:rows/x:row' passing t1.doc columns data int PATH
'x:val[../x:and = 60]') as x;
data
------
50
(1 row)
select x.* from t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' passing t1.doc columns data int PATH
'val[../and = 60]') as x;
data
------

(1 row)

Yes, this check needs a context-aware parser. I expect this to be a rare
corner case, so a documentation-based solution is enough.

Other comments follow.

- Please add more comments. The XPATH_TOKEN_NAME case of _transformXPath
in my patch has more.

- Debug output might be needed.

# sorry now time's up. will continue tomorrow.

I hope I have fixed almost all of the issues. Your patch is merged with some
changes. The most significant change is the reaction to a broken XPath
expression: I prefer to do nothing and let libxml2 raise the error.

Attached new version.

Thank you for tips, ideas, code :)

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

xml-xpath-default-ns-5.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f901567f7e..76424efa31 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10468,7 +10468,8 @@ SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuf
      second the namespace URI. It is not required that aliases provided in
      this array be the same as those being used in the XML document itself (in
      other words, both in the XML document and in the <function>xpath</function>
-     function context, aliases are <emphasis>local</emphasis>).
+     function context, aliases are <emphasis>local</emphasis>). The default namespace
+     has an empty name (empty string), and at most one may be specified.
     </para>
 
     <para>
@@ -10484,11 +10485,20 @@ SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
 ]]></screen>
     </para>
 
+    <para>
+     Inside a predicate, the words <literal>and</literal>, <literal>or</literal>,
+     <literal>div</literal> and <literal>mod</literal> are always treated as
+     keywords (XPath operators), and the default namespace is not applied to them.
+     If you need to use these words as tag names, do not rely on the default
+     namespace; qualify them explicitly with a
+     namespace prefix instead.
+    </para>
+
     <para>
      To deal with default (anonymous) namespaces, do something like this:
 <screen><![CDATA[
-SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
-             ARRAY[ARRAY['mydefns', 'http://example.com']]);
+SELECT xpath('//b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
+             ARRAY[ARRAY['', 'http://example.com']]);
 
  xpath
 --------
@@ -10562,8 +10572,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</m
     <para>
      The optional <literal>XMLNAMESPACES</literal> clause is a comma-separated
      list of namespaces.  It specifies the XML namespaces used in
-     the document and their aliases. A default namespace specification
-     is not currently supported.
+     the document and their aliases.
     </para>
 
     <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 1fb018416e..b60a3cfe0d 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index c9d07f2ae9..8c7f37df4c 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -90,7 +90,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -187,6 +187,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -227,6 +228,7 @@ const TableFuncRoutine XmlTableRoutine =
 #define NAMESPACE_XSI "http://www.w3.org/2001/XMLSchema-instance"
 #define NAMESPACE_SQLXML "http://standards.iso.org/iso/9075/2003/sqlxml"
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace.pgsqlxml.internal"
 
 #ifdef USE_LIBXML
 
@@ -3849,6 +3851,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	int			ndim;
 	Datum	   *ns_names_uris;
 	bool	   *ns_names_uris_nulls;
+	bool		with_default_ns = false;
 	int			ns_count;
 
 	/*
@@ -3898,7 +3901,6 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 				 errmsg("empty XPath expression")));
 
 	string = pg_xmlCharStrndup(datastr, len);
-	xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -3941,6 +3943,26 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 							(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 							 errmsg("neither namespace name nor URI may be null")));
 				ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+
+				/* Don't allow the same namespace name as our internal default namespace name */
+				if (strcmp(ns_name, DEFAULT_NAMESPACE_NAME) == 0)
+					ereport(ERROR,
+								(errcode(ERRCODE_RESERVED_NAME),
+								 errmsg("cannot to use \"%s\" as namespace name",
+										  DEFAULT_NAMESPACE_NAME),
+								 errdetail("\"%s\" is reserved for internal purpose",
+										  DEFAULT_NAMESPACE_NAME)));
+				if (*ns_name == '\0')
+				{
+					if (with_default_ns)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("only one default namespace is allowed")));
+
+					with_default_ns = true;
+					ns_name = DEFAULT_NAMESPACE_NAME;
+				}
+
 				ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
 				if (xmlXPathRegisterNs(xpathctx,
 									   (xmlChar *) ns_name,
@@ -3951,6 +3973,16 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 			}
 		}
 
+		if (with_default_ns)
+		{
+			StringInfoData		str;
+
+			transformXPath(&str, text_to_cstring(xpath_expr_text), DEFAULT_NAMESPACE_NAME);
+			xpath_expr = pg_xmlCharStrndup(str.data, str.len);
+		}
+		else
+			xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
+
 		xpathcomp = xmlXPathCompile(xpath_expr);
 		if (xpathcomp == NULL || xmlerrcxt->err_occurred)
 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
@@ -4195,6 +4227,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4287,6 +4320,7 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif							/* not USE_LIBXML */
 }
 
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4297,12 +4331,25 @@ XmlTableSetNamespace(TableFuncScanState *state, char *name, char *uri)
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name != NULL)
+	{
+		/* Don't allow the same namespace name as our internal default namespace name */
+		if (strcmp(name, DEFAULT_NAMESPACE_NAME) == 0)
+			ereport(ERROR,
+						(errcode(ERRCODE_RESERVED_NAME),
+						 errmsg("cannot to use \"%s\" as namespace name",
+								  DEFAULT_NAMESPACE_NAME),
+						 errdetail("\"%s\" is reserved for internal purpose",
+								  DEFAULT_NAMESPACE_NAME)));
+	}
+	else
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4331,6 +4378,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4362,6 +4417,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..35441a646c
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,361 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML related functionality is based on libxml2 library, and
+ * XPath support is not an exception.  However, libxml2 doesn't support
+ * default namespace for XPath expressions. Because there is no API to
+ * transform or access a parsed XPath expression, we have to parse the
+ * XPath expression here.
+ *
+ * Those functionalities are implemented with a simple XPath parser/
+ * preprocessor.  This XPath parser transforms a XPath expression to another
+ * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
+ * replace libxml2 XPath parser or libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens.  When the default namespace
+ * is defined, every unqualified node name is prefixed with the internal
+ * default namespace name.  Attribute names, function names and axis names
+ * are written out unchanged.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_COLON,
+	XPATH_TOKEN_DCOLON,
+	XPATH_TOKEN_OTHER
+}	XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	char	   *start;
+	int			length;
+}	XPathTokenInfo;
+
+typedef struct ParserData
+{
+	char	   *str;
+	char	   *cur;
+	XPathTokenInfo buffer;
+	bool		buffer_is_empty;
+}	XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define IS_NODENAME_FIRSTCHAR(c)	 ((c) == '_' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(IS_NODENAME_FIRSTCHAR(c) || (c) == '-' || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+#define TOKEN_IS_EMPTY(t)		((t).ttype == XPATH_TOKEN_NONE)
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static char *
+getXPathToken(char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (IS_NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else if (c == ':')
+		{
+			/* look ahead to detect a double-colon */
+			if (*str == ':')
+			{
+				ti->ttype = XPATH_TOKEN_DCOLON;
+				str++;
+			}
+			else
+				ti->ttype = XPATH_TOKEN_COLON;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->buffer_is_empty = true;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+	{
+		memcpy(ti, &parser->buffer, sizeof(XPathTokenInfo));
+		parser->buffer_is_empty = true;
+	}
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+		elog(ERROR, "internal error");
+
+	memcpy(&parser->buffer, ti, sizeof(XPathTokenInfo));
+	parser->buffer_is_empty = false;
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * This is the main part of the XPath transformation. It can be called
+ * recursively when the XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		tagname_needs_defnsp;
+	bool		token_is_tagattrib = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+			case XPATH_TOKEN_COLON:
+			case XPATH_TOKEN_DCOLON:
+				/* write without any changes */
+				writeXPathToken(str, &t1);
+				/* process fresh token */
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					/*
+					 * Inside predicate ignore keywords (literal operators)
+					 * "and" "or" "div" and "mod".
+					 */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "or", 2) == 0 && t1.length == 2) ||
+						 (strncmp(t1.start, "div", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "mod", 3) == 0 && t1.length == 3))
+						{
+							token_is_tagattrib = false;
+
+							/* keyword */
+							writeXPathToken(str, &t1);
+							/* process fresh token */
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					tagname_needs_defnsp = true;
+
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_COLON)
+					{
+						/* t1 is a qualified node name. no need to add default one. */
+						tagname_needs_defnsp = false;
+
+						/* namespace name */
+						writeXPathToken(str, &t1);
+						/* colon */
+						writeXPathToken(str, &t2);
+						/* get node name */
+						nextXPathToken(parser, &t1);
+					}
+					else if (t2.ttype == XPATH_TOKEN_DCOLON)
+					{
+						/* t1 is an axis name. write out as it is */
+						if (strncmp(t1.start, "attribute", 9) == 0 && t1.length == 9)
+							token_is_tagattrib = true;
+
+						/* axis name */
+						writeXPathToken(str, &t1);
+						/* double colon */
+						writeXPathToken(str, &t2);
+
+						/*
+						 * The next token may be qualified tag name, process
+						 * it as a fresh token.
+						 */
+						nextXPathToken(parser, &t1);
+						break;
+					}
+					else if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						/* function name doesn't require namespace */
+						if (*t2.start == '(')
+							tagname_needs_defnsp = false;
+						else
+							pushXPathToken(parser, &t2);
+					}
+
+					if (tagname_needs_defnsp && !token_is_tagattrib)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					token_is_tagattrib = false;
+
+					/* write maybe-tagname if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t1))
+						writeXPathToken(str, &t1);
+
+					/* output t2 if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t2))
+						writeXPathToken(str, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						if (c == ']' && inside_predicate)
+						{
+							return;
+						}
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+								token_is_tagattrib = true;
+
+							pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	Assert(def_namespace_name != NULL);
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+
+	elog(DEBUG1, "apply default namespace \"%s\"", str->data);
+}
+
+#endif
diff --git a/src/include/utils/xpath_parser.h b/src/include/utils/xpath_parser.h
new file mode 100644
index 0000000000..b2fc239e12
--- /dev/null
+++ b/src/include/utils/xpath_parser.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.h
+ *	  Declarations for XML XPath transformation.
+ *
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/xpath_parser.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef XPATH_PARSER_H
+#define XPATH_PARSER_H
+
+#include "postgres.h"
+#include "lib/stringinfo.h"
+
+void transformXPath(StringInfo str, char *xpath, char *def_namespace_name);
+
+#endif   /* XPATH_PARSER_H */
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index bcc585d427..63e04f1353 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1085,7 +1085,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1452,3 +1456,56 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index d3bd8c91d7..58f9151788 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1302,3 +1302,59 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
 ---
 (0 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+ERROR:  unsupported XML feature
+LINE 1: INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a ...
+                                  ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="ht...
+                                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x...
+                                              ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: ...ELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xml...
+                                                             ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index ff77132803..c92a09e5a9 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1065,7 +1065,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1432,3 +1436,56 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index eb4687fb09..e8cff5f22d 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -558,3 +558,23 @@ INSERT INTO xmltest2 VALUES('<d><r><dc>2</dc></r></d>', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
#11Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Pavel Stehule (#10)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On Thu, Nov 9, 2017 at 10:11 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

Attached new version.

Hi Pavel,

FYI my patch testing robot says[1]:

xml ... FAILED

regression.diffs says:

+ SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' PASSING t1.doc COLUMNS data int PATH
'child::a[1][attribute::hoge="haha"]') as x;
+ data
+ ------
+ (0 rows)
+

Maybe you forgot to git-add the expected file?

[1]: https://travis-ci.org/postgresql-cfbot/postgresql/builds/305979133

--
Thomas Munro
http://www.enterprisedb.com

#12Pavel Stehule
pavel.stehule@gmail.com
In reply to: Thomas Munro (#11)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi

2017-11-22 22:49 GMT+01:00 Thomas Munro <thomas.munro@enterprisedb.com>:

On Thu, Nov 9, 2017 at 10:11 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Attached new version.

Hi Pavel,

FYI my patch testing robot says[1]:

xml ... FAILED

regression.diffs says:

+ SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' PASSING t1.doc COLUMNS data int PATH
'child::a[1][attribute::hoge="haha"]') as x;
+ data
+ ------
+ (0 rows)
+

Maybe you forgot to git-add the expected file?

[1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/305979133

Unfortunately, xml.out has three variants, and it is possible that one of them
has to be generated somewhere other than on my computer.

Could you please send me your resulting xml.out file?

Regards

Pavel

--
Thomas Munro
http://www.enterprisedb.com

#13Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#12)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2017-11-24 17:53 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

Unfortunately, xml.out has three variants, and it is possible that one of them
has to be generated somewhere other than on my computer.

Could you please send me your resulting xml.out file?

It looks like this case is the build without XML support, so I can fix it on my computer.

Regards

Pavel

--
Thomas Munro
http://www.enterprisedb.com

#14Pavel Stehule
pavel.stehule@gmail.com
In reply to: Pavel Stehule (#13)
1 attachment(s)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

2017-11-24 18:13 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

It looks like this case is the variant built without XML support, so I can fix it on my machine.

fixed regress test

Regards

Pavel

Attachments:

xml-xpath-default-ns-6.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4dd9d029e6..b871e82a73 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10489,7 +10489,8 @@ SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuf
      second the namespace URI. It is not required that aliases provided in
      this array be the same as those being used in the XML document itself (in
      other words, both in the XML document and in the <function>xpath</function>
-     function context, aliases are <emphasis>local</emphasis>).
+     function context, aliases are <emphasis>local</emphasis>). The default namespace has
+     an empty name (empty string), and only one may be specified.
     </para>
 
     <para>
@@ -10505,11 +10506,20 @@ SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
 ]]></screen>
     </para>
 
+    <para>
+     Inside predicates, the literals <literal>and</literal>, <literal>or</literal>,
+     <literal>div</literal> and <literal>mod</literal> are always treated as keywords
+     (XPath operators), and the default namespace is not applied to them.
+     If you want to use these literals as tag names, then the default namespace
+     should not be used, and these names should be explicitly
+     qualified.
+    </para>
+
     <para>
      To deal with default (anonymous) namespaces, do something like this:
 <screen><![CDATA[
-SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
-             ARRAY[ARRAY['mydefns', 'http://example.com']]);
+SELECT xpath('//b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
+             ARRAY[ARRAY['', 'http://example.com']]);
 
  xpath
 --------
@@ -10583,8 +10593,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</m
     <para>
      The optional <literal>XMLNAMESPACES</literal> clause is a comma-separated
      list of namespaces.  It specifies the XML namespaces used in
-     the document and their aliases. A default namespace specification
-     is not currently supported.
+     the document and their aliases.
     </para>
 
     <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 1fb018416e..b60a3cfe0d 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index fa392cd0e5..90c239c8b7 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -90,7 +90,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -187,6 +187,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -227,6 +228,7 @@ const TableFuncRoutine XmlTableRoutine =
 #define NAMESPACE_XSI "http://www.w3.org/2001/XMLSchema-instance"
 #define NAMESPACE_SQLXML "http://standards.iso.org/iso/9075/2003/sqlxml"
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace.pgsqlxml.internal"
 
 #ifdef USE_LIBXML
 
@@ -3850,6 +3852,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	int			ndim;
 	Datum	   *ns_names_uris;
 	bool	   *ns_names_uris_nulls;
+	bool		with_default_ns = false;
 	int			ns_count;
 
 	/*
@@ -3899,7 +3902,6 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 				 errmsg("empty XPath expression")));
 
 	string = pg_xmlCharStrndup(datastr, len);
-	xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
 
 	/*
 	 * In a UTF8 database, skip any xml declaration, which might assert
@@ -3953,6 +3955,26 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 							(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 							 errmsg("neither namespace name nor URI may be null")));
 				ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+
+				/* Don't allow same namespace as our internal default namespace name */
+				if (strcmp(ns_name, DEFAULT_NAMESPACE_NAME) == 0)
+					ereport(ERROR,
+								(errcode(ERRCODE_RESERVED_NAME),
+								 errmsg("cannot to use \"%s\" as namespace name",
+										  DEFAULT_NAMESPACE_NAME),
+								 errdetail("\"%s\" is reserved for internal purpose",
+										  DEFAULT_NAMESPACE_NAME)));
+				if (*ns_name == '\0')
+				{
+					if (with_default_ns)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("only one default namespace is allowed")));
+
+					with_default_ns = true;
+					ns_name = DEFAULT_NAMESPACE_NAME;
+				}
+
 				ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
 				if (xmlXPathRegisterNs(xpathctx,
 									   (xmlChar *) ns_name,
@@ -3963,6 +3985,16 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 			}
 		}
 
+		if (with_default_ns)
+		{
+			StringInfoData		str;
+
+			transformXPath(&str, text_to_cstring(xpath_expr_text), DEFAULT_NAMESPACE_NAME);
+			xpath_expr = pg_xmlCharStrndup(str.data, str.len);
+		}
+		else
+			xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
+
 		xpathcomp = xmlXPathCompile(xpath_expr);
 		if (xpathcomp == NULL || xmlerrcxt->err_occurred)
 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
@@ -4207,6 +4239,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4299,6 +4332,7 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif							/* not USE_LIBXML */
 }
 
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4309,12 +4343,25 @@ XmlTableSetNamespace(TableFuncScanState *state, const char *name, const char *ur
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name != NULL)
+	{
+		/* Don't allow same namespace as our internal default namespace name */
+		if (strcmp(name, DEFAULT_NAMESPACE_NAME) == 0)
+			ereport(ERROR,
+						(errcode(ERRCODE_RESERVED_NAME),
+						 errmsg("cannot to use \"%s\" as namespace name",
+								  DEFAULT_NAMESPACE_NAME),
+						 errdetail("\"%s\" is reserved for internal purpose",
+								  DEFAULT_NAMESPACE_NAME)));
+	}
+	else
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4343,6 +4390,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, const char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4374,6 +4429,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, const char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..35441a646c
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,361 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML-related functionality is based on the libxml2 library, and
+ * XPath support is no exception.  However, libxml2 doesn't support a
+ * default namespace for XPath expressions.  Because there is no API
+ * to transform or access a parsed XPath expression, we have to parse
+ * XPath here.
+ *
+ * This functionality is implemented with a simple XPath parser/
+ * preprocessor.  This XPath parser transforms an XPath expression into another
+ * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
+ * replace the libxml2 XPath parser or libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens.  When an expression starts with
+ * a node name, we can add a prefix.  When a default namespace is defined, we
+ * qualify every unqualified node name (but not attribute names) with the
+ * default namespace.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_COLON,
+	XPATH_TOKEN_DCOLON,
+	XPATH_TOKEN_OTHER
+}	XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	char	   *start;
+	int			length;
+}	XPathTokenInfo;
+
+typedef struct ParserData
+{
+	char	   *str;
+	char	   *cur;
+	XPathTokenInfo buffer;
+	bool		buffer_is_empty;
+}	XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define IS_NODENAME_FIRSTCHAR(c)	 ((c) == '_' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(IS_NODENAME_FIRSTCHAR(c) || (c) == '-' || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+#define TOKEN_IS_EMPTY(t)		((t).ttype == XPATH_TOKEN_NONE)
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static char *
+getXPathToken(char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (IS_NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else if (c == ':')
+		{
+			/* look ahead to detect a double-colon */
+			if (*str == ':')
+			{
+				ti->ttype = XPATH_TOKEN_DCOLON;
+				str++;
+			}
+			else
+				ti->ttype = XPATH_TOKEN_COLON;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->buffer_is_empty = true;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+	{
+		memcpy(ti, &parser->buffer, sizeof(XPathTokenInfo));
+		parser->buffer_is_empty = true;
+	}
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+		elog(ERROR, "internal error");
+
+	memcpy(&parser->buffer, ti, sizeof(XPathTokenInfo));
+	parser->buffer_is_empty = false;
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * This is the main part of the XPath transformation. It is called recursively
+ * when the XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		tagname_needs_defnsp;
+	bool		token_is_tagattrib = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+			case XPATH_TOKEN_COLON:
+			case XPATH_TOKEN_DCOLON:
+				/* write without any changes */
+				writeXPathToken(str, &t1);
+				/* process fresh token */
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					/*
+					 * Inside predicate ignore keywords (literal operators)
+					 * "and" "or" "div" and "mod".
+					 */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "or", 2) == 0 && t1.length == 2) ||
+						 (strncmp(t1.start, "div", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "mod", 3) == 0 && t1.length == 3))
+						{
+							token_is_tagattrib = false;
+
+							/* keyword */
+							writeXPathToken(str, &t1);
+							/* process fresh token */
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					tagname_needs_defnsp = true;
+
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_COLON)
+					{
+						/* t1 is a qualified node name. no need to add default one. */
+						tagname_needs_defnsp = false;
+
+						/* namespace name */
+						writeXPathToken(str, &t1);
+						/* colon */
+						writeXPathToken(str, &t2);
+						/* get node name */
+						nextXPathToken(parser, &t1);
+					}
+					else if (t2.ttype == XPATH_TOKEN_DCOLON)
+					{
+						/* t1 is an axis name. write out as it is */
+						if (strncmp(t1.start, "attribute", 9) == 0 && t1.length == 9)
+							token_is_tagattrib = true;
+
+						/* axis name */
+						writeXPathToken(str, &t1);
+						/* double colon */
+						writeXPathToken(str, &t2);
+
+						/*
+						 * The next token may be qualified tag name, process
+						 * it as a fresh token.
+						 */
+						nextXPathToken(parser, &t1);
+						break;
+					}
+					else if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						/* function name doesn't require namespace */
+						if (*t2.start == '(')
+							tagname_needs_defnsp = false;
+						else
+							pushXPathToken(parser, &t2);
+					}
+
+					if (tagname_needs_defnsp && !token_is_tagattrib)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					token_is_tagattrib = false;
+
+					/* write maybe-tagname if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t1))
+						writeXPathToken(str, &t1);
+
+					/* output t2 if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t2))
+						writeXPathToken(str, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						if (c == ']' && inside_predicate)
+						{
+							return;
+						}
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+								token_is_tagattrib = true;
+
+							pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	Assert(def_namespace_name != NULL);
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+
+	elog(DEBUG1, "apply default namespace \"%s\"", str->data);
+}
+
+#endif
diff --git a/src/include/utils/xpath_parser.h b/src/include/utils/xpath_parser.h
new file mode 100644
index 0000000000..b2fc239e12
--- /dev/null
+++ b/src/include/utils/xpath_parser.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.h
+ *	  Declarations for XML XPath transformation.
+ *
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/xpath_parser.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef XPATH_PARSER_H
+#define XPATH_PARSER_H
+
+#include "postgres.h"
+#include "lib/stringinfo.h"
+
+void transformXPath(StringInfo str, char *xpath, char *def_namespace_name);
+
+#endif   /* XPATH_PARSER_H */
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 7fa1309108..28a8a2a1ad 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1120,7 +1120,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1487,3 +1491,56 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index 970ab26fce..6c30cd3709 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1337,3 +1337,64 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
 ---
 (0 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+ERROR:  unsupported XML feature
+LINE 1: INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a ...
+                                  ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+(0 rows)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="ht...
+                                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x...
+                                              ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: ...ELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xml...
+                                                             ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index 112ebe47cd..15de697c3a 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1100,7 +1100,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1467,3 +1471,56 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index cb96e18005..057991a0c4 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -594,3 +594,23 @@ INSERT INTO xmltest2 VALUES('<d><r><dc>2</dc></r></d>', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
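
To make the rewriting step easier to follow, here is a minimal standalone sketch in plain C of the kind of transformation the patch performs: unqualified element names get a prefix, while qualified names, attribute names, axis names, function names, string literals, and (inside predicates) the operators and/or/div/mod are copied unchanged. This is not the patch's code. The function name sketch_add_default_ns, its helpers, and the fixed-size output buffer are assumptions made for illustration, and unlike getXPathToken in the patch it also skips single-quoted literals.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers; names are illustrative, not taken from the patch. */
static int
name_start(unsigned char c)
{
	return c == '_' || isalpha(c) || c >= 0x80;
}

static int
name_char(unsigned char c)
{
	return name_start(c) || isdigit(c) || c == '-' || c == '.';
}

static int
is_operator(const char *s, size_t n)
{
	return (n == 3 && (!strncmp(s, "and", 3) || !strncmp(s, "div", 3) ||
					   !strncmp(s, "mod", 3))) ||
		(n == 2 && !strncmp(s, "or", 2));
}

/* Append n bytes to out, always leaving room for the terminating NUL. */
#define PUT(s, n) \
	do { \
		if (o + (size_t) (n) >= outsz) return -1; \
		memcpy(out + o, (s), (size_t) (n)); \
		o += (size_t) (n); \
	} while (0)

/*
 * Copy xpath into out, qualifying bare element names with "prefix:".
 * Returns 0 on success, -1 if out is too small.
 */
static int
sketch_add_default_ns(const char *xpath, const char *prefix,
					  char *out, size_t outsz)
{
	size_t		o = 0;
	int			depth = 0;		/* predicate nesting level */
	int			attr = 0;		/* next name is an attribute name */
	const char *p = xpath;

	while (*p)
	{
		const char *start = p;

		if (*p == '"' || *p == '\'')
		{
			/* string literal: copy through the closing quote */
			char		q = *p++;

			while (*p && *p != q)
				p++;
			if (*p)
				p++;
			PUT(start, p - start);
		}
		else if (name_start((unsigned char) *p))
		{
			const char *next;
			size_t		len;

			while (name_char((unsigned char) *p))
				p++;
			len = (size_t) (p - start);
			for (next = p; *next == ' '; next++)
				;

			if (next[0] == ':' && next[1] == ':')
			{
				/* axis name: copy it together with the double colon */
				attr = (len == 9 && !strncmp(start, "attribute", 9));
				p = next + 2;
				PUT(start, p - start);
			}
			else if (next[0] == ':')
			{
				/* already qualified: copy prefix, colon and local name */
				for (p = next + 1; name_char((unsigned char) *p); p++)
					;
				PUT(start, p - start);
				attr = 0;
			}
			else
			{
				/* bare name: prefix it unless attribute, function, operator */
				if (!attr && next[0] != '(' &&
					!(depth > 0 && is_operator(start, len)))
				{
					PUT(prefix, strlen(prefix));
					PUT(":", 1);
				}
				PUT(start, len);
				attr = 0;
			}
		}
		else
		{
			if (*p == '[')
				depth++;
			else if (*p == ']' && depth > 0)
				depth--;
			if (*p == '@')
				attr = 1;
			else if (*p != ' ')
				attr = 0;
			p++;
			PUT(start, 1);
		}
	}
	out[o] = '\0';
	return 0;
}
```

With a prefix of "p", the column path from the regression test, child::a[1][attribute::hoge="haha"], becomes child::p:a[1][attribute::hoge="haha"], which libxml2 can evaluate once "p" is registered for the default namespace URI. As in the patch, an element literally named and, or, div, or mod cannot be addressed inside a predicate without an explicit prefix.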
#15Michael Paquier
michael.paquier@gmail.com
In reply to: Pavel Stehule (#14)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On Sat, Nov 25, 2017 at 2:32 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

fixed regress test

The last patch still applies, but did not get any reviews.
Horiguchi-san, you are marked as a reviewer of this patch. Could you
look at it? For now, I am moving it to next CF.
--
Michael

#16Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Michael Paquier (#15)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

At Wed, 29 Nov 2017 14:33:08 +0900, Michael Paquier <michael.paquier@gmail.com> wrote in <CAB7nPqROox3HnfjTGhKo4NA97oX8g1DSr0LULWVYsaKHuYqZEw@mail.gmail.com>

On Sat, Nov 25, 2017 at 2:32 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

fixed regress test

The last patch still applies, but did not get any reviews.
Horiguchi-san, you are marked as a reviewer of this patch. Could you
look at it? For now, I am moving it to next CF.

Sorry for the absence. It is nearly complete. I'll look at this soon.

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#17Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Pavel Stehule (#14)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hello, I returned to this.

I thoroughly checked the translator's behavior against the XPath
specification and checked the documentation and regression
test. Almost everything looks fine to me, and this should be my
last round of comments.

At Fri, 24 Nov 2017 18:32:43 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRB7Fs_2DrtUTGhTmQb+KReXPH6SG62hGWO3KVL_eZYCaA@mail.gmail.com>

2017-11-24 18:13 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

2017-11-24 17:53 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:

Hi

2017-11-22 22:49 GMT+01:00 Thomas Munro <thomas.munro@enterprisedb.com>:

On Thu, Nov 9, 2017 at 10:11 PM, Pavel Stehule <pavel.stehule@gmail.com>
wrote:

Attached new version.

Hi Pavel,

FYI my patch testing robot says[1]:

xml ... FAILED

regression.diffs says:

+ SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'),
'/rows/row' PASSING t1.doc COLUMNS data int PATH
'child::a[1][attribute::hoge="haha"]') as x;
+ data
+ ------
+ (0 rows)
+

Maybe you forgot to git-add the expected file?

[1] https://travis-ci.org/postgresql-cfbot/postgresql/builds/305979133

unfortunately xml.out has 3 versions and is possible so one version
should be taken elsewhere than my comp.

please can me send your result xml.out file?

looks like this case is without xml support so I can fix on my comp.

fixed regress test

(I wouldn't have found that..)

I have three comments on the behavior and one on documentation.

1. Lack of syntax handling.

["'" [^']* "'"] is also a string literal, but getXPathToken does
not handle it and mistakenly applies the default namespace to the
string content.

2. Additional comment might be good.

It might be better to have an additional description of the default
namespace in the comment that starts with "Namespace mappings are
passed as text[]" in xpath_internal().

3. Inconsistent behavior from named namespace.

| - function context, aliases are <emphasis>local</emphasis>).
| + function context, aliases are <emphasis>local</emphasis>). Default namespace has
| + empty name (empty string) and should be only one.

This works as described. On the other hand, the same
namespace prefix can be defined twice or more in the array, and
the last definition takes effect. I don't see a reason to
treat the default namespace differently.

4. Comments on the documentation part.

# Even though I'm not well suited to commenting on wording...

| + Inside predicate literals <literal>and</literal>, <literal>or</literal>,
| + <literal>div</literal> and <literal>mod</literal> are used as keywords
| + (XPath operators) every time and default namespace are not applied there.

*I*'d like to have a comma between 'predicate' and 'literals',
and an 'a' before 'predicate'. Or might 'Literals .. inside a
predicate' be better?

'are used as keywords' might be better as 'are identified as
keywords'?

The default namespace is applied to tag names other than the
listed keywords, even inside a predicate. So 'are not applied
there' might be better as 'are not applied to them'? Or 'are not
applied in the case'?

| + If you would to use these literals like tag names, then the default namespace
| + should not be used, and these literals should be explicitly
| + labeled.
| + </para>

The default namespace is withheld *only from* such keywords inside
a predicate. Even if an XPath expression contains such a tag name,
the default namespace still applies to the other tags. Does the
following make sense?

+ Use named namespace to qualify such tag names appear in an
+ XPath predicate.

===
After the points above are addressed (or rejected), I think I
have no additional comments.

- This current patch applies safely (with small shifts) on the
current master.

- The code looks fine to me.

- This patch translates the given XPath expression by prefixing
unprefixed tag names with a special namespace prefix only in
the case where default namespace is defined, so the existing
behavior is not affected.

- The syntax already exists and was just not implemented, so I
don't think any argument is needed here.

- It forbids, without documenting it, the use of the namespace prefix
"pgdefnamespace.pgsqlxml.internal", but I believe no one will
notice that.

- The default-ns translator (xpath_parser.c) seems to work
perfectly, with some harmless exceptions.

(The XPath specification is here: https://www.w3.org/TR/1999/REC-xpath-19991116/)

Related unused features (and not documented?):
context variables ($n notations),
user-defined functions (or function names prefixed by a namespace prefix)

Newly documented behavior:
the default namespace isn't applied to and/or/div/mod.

- Documentation looks sufficient.

- Regression test doesn't cover the XPath syntax but it's not
viable. I am fine with the basic test cases added by the
current patch.
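
To make the "newly documented behavior" above concrete, here is a minimal
sketch of the keyword test, assuming the token is given as a start pointer
plus a length, as in the patch's XPathTokenInfo. The function name
is_xpath_operator_name is hypothetical and does not appear in the patch; the
patch performs the equivalent comparison inline and only inside a predicate.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/*
 * Hypothetical helper, not taken from the patch: returns true when the
 * token (start, length) is spelled exactly like one of the XPath operator
 * names.  The token does not need to be NUL-terminated at "length".
 */
static bool
is_xpath_operator_name(const char *start, int length)
{
	static const char *const names[] = {"and", "or", "div", "mod"};
	size_t		i;

	assert(start != NULL && length >= 0);

	for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
	{
		/* the length must match too, so "android" is not taken for "and" */
		if ((size_t) length == strlen(names[i]) &&
			strncmp(start, names[i], (size_t) length) == 0)
			return true;
	}

	return false;
}
```

A name token for which this returns true would be written out unchanged;
any other unprefixed element name would get the internal default prefix.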

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#18Pavel Stehule
pavel.stehule@gmail.com
In reply to: Kyotaro HORIGUCHI (#17)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi

Could you please add examples of the issues you mentioned? Then I can fix them faster.

Thank you very much

Pavel

#19Pavel Stehule
pavel.stehule@gmail.com
In reply to: Kyotaro HORIGUCHI (#17)
1 attachment(s)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi

2018-01-23 8:13 GMT+01:00 Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>:

Hello, I returned to this.

I thoroughly checked the translator's behavior against the XPath
specification and checked the documentation and regression
test. Almost everything looks fine to me, and this should be my
last round of comments.

I have three comments on the behavior and one on documentation.

1. Lack of syntax handling.

["'" [^']* "'"] is also a string literal, but getXPathToken does
not handle it and mistakenly applies the default namespace to the
string content.

In this case, I am not sure I understand correctly.

Within expressions, literal strings are delimited by single or double
quotation marks, which are also used to delimit XML attributes. To avoid a
quotation mark in an expression being interpreted by the XML processor as
terminating the attribute value the quotation mark can be entered as a
character reference (&quot; or &apos;). Alternatively, the expression can
use single quotation marks if the XML attribute is delimited with double
quotation marks or vice-versa.
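
The quoted rule (a literal may be delimited by either quote character and may
contain the other one) can be sketched roughly as follows. This is only an
illustration under my own assumptions: skip_xpath_literal is a hypothetical
name, and the patch handles the two delimiters in separate branches of
getXPathToken rather than in one helper.

```c
#include <assert.h>

/*
 * Hypothetical helper, not taken from the patch: "str" points at the
 * opening quote of an XPath literal.  Returns a pointer just past the
 * matching closing quote, or to the terminating NUL if the literal is
 * unterminated.
 */
static const char *
skip_xpath_literal(const char *str)
{
	char		quote = *str;

	/* an XPath literal opens with either a double or a single quote */
	assert(quote == '"' || quote == '\'');

	str++;
	while (*str != '\0')
	{
		/* only the quote character that opened the literal closes it */
		if (*str++ == quote)
			break;
	}

	return str;
}
```

Everything between the returned pointer and the opening quote is literal
content and must be copied to the output without adding a namespace prefix.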

So if I understand correctly, a string literal can look like ' some " some ' or
" some ' some ". I fixed it.

2. Additional comment might be good.

It might be better to have an additional description of the default
namespace in the comment that starts with "Namespace mappings are
passed as text[]" in xpath_internal().

fixed

3. Inconsistent behavior from named namespace.

| - function context, aliases are <emphasis>local</emphasis>).
| + function context, aliases are <emphasis>local</emphasis>). Default namespace has
| + empty name (empty string) and should be only one.

This works as described. On the other hand, the same
namespace prefix can be defined twice or more in the array, and
the last definition takes effect. I don't see a reason to
treat the default namespace differently.

It looks like a libxml2 bug. It makes no sense to use more than one default
namespace, and although this is inconsistent with the handling of other
namespaces, I think it is correct. It is better to raise an error early. For
named namespaces, Postgres expects libxml2 to perform all namespace checks, and
libxml2 is tolerant. The default namespace is implemented inside Postgres, and
I don't see any advantage in tolerant behavior. Multiple default namespaces are
disallowed for XMLTABLE, so there would be some inconsistency either way. In
this case I prefer to raise an error to signal ambiguous or badly formatted
input clearly.
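
As a rough sketch of that early check, assuming the namespace array has
already been unpacked into name/URI pairs: the NsMapping struct and the
function find_duplicate_default_ns are hypothetical and only illustrate the
idea. The patch itself does the check inline with a with_default_ns flag and
ereport().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical name/URI pair; the real code reads these from a text[] array. */
typedef struct NsMapping
{
	const char *name;			/* empty string means "default namespace" */
	const char *uri;
} NsMapping;

/*
 * Returns the index of the second default-namespace entry, or -1 when the
 * list contains at most one.  A caller would raise an error for any result
 * other than -1.
 */
static int
find_duplicate_default_ns(const NsMapping *ns, size_t count)
{
	int			seen_default = 0;
	size_t		i;

	assert(ns != NULL || count == 0);

	for (i = 0; i < count; i++)
	{
		if (ns[i].name[0] != '\0')
			continue;			/* named namespace: checking is left to libxml2 */
		if (seen_default)
			return (int) i;
		seen_default = 1;
	}

	return -1;
}
```

Repeated named prefixes are deliberately ignored here, which mirrors the
asymmetry discussed above.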

4. Comments on the documentation part.

# Even though I'm not well suited to commenting on wording...

| + Inside predicate literals <literal>and</literal>, <literal>or</literal>,
| + <literal>div</literal> and <literal>mod</literal> are used as keywords
| + (XPath operators) every time and default namespace are not applied there.

*I*'d like to have a comma between 'predicate' and 'literals',
and an 'a' before 'predicate'. Or might 'Literals .. inside a
predicate' be better?

fixed

'are used as keywords' might be better as 'are identified as
keywords'?

fixed

The default namespace is applied to tag names other than the
listed keywords, even inside a predicate. So 'are not applied
there' might be better as 'are not applied to them'? Or 'are not
applied in the case'?

| + If you would to use these literals like tag names, then the default namespace
| + should not be used, and these literals should be explicitly
| + labeled.
| + </para>

The default namespace is withheld *only from* such keywords inside
a predicate. Even if an XPath expression contains such a tag name,
the default namespace still applies to the other tags. Does the
following make sense?

+ Use named namespace to qualify such tag names appear in an
+ XPath predicate.

fixed

I hope a native speaker will finalize the docs. That is beyond my knowledge.

===
After the points above are addressed (or rejected), I think I
have no additional comments.

- This current patch applies safely (with small shifts) on the
current master.

- The code looks fine to me.

- This patch translates the given XPath expression by prefixing
unprefixed tag names with a special namespace prefix only in
the case where default namespace is defined, so the existing
behavior is not affected.

- The syntax already exists and was just not implemented, so I
don't think any argument is needed here.

- It forbids, without documenting it, the use of the namespace prefix
"pgdefnamespace.pgsqlxml.internal", but I believe no one will
notice that.

- The default-ns translator (xpath_parser.c) seems to work
perfectly, with some harmless exceptions.

(The XPath specification is here: https://www.w3.org/TR/1999/REC-xpath-19991116/)

Related unused features (and not documented?):
context variables ($n notations),
user-defined functions (or function names prefixed by a namespace prefix)

I did a quick check, and these features are not supported by libxml2, so it
is questionable whether they should be parsed by our XPath parser. So it is
not possible to check or test them :(

Newly documented behavior:
the default namespace isn't applied to and/or/div/mod.

- Documentation looks sufficient.

- Regression test doesn't cover the XPath syntax but it's not
viable. I am fine with the basic test cases added by the
current patch.

regards,

I am sending an updated version.

Thank you very much for the very valuable review.

Pavel

Attachments:

xml-xpath-default-ns-7.patch (text/x-patch; charset=US-ASCII)
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 487c7ff750..3c410c5ac2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10497,7 +10497,8 @@ SELECT xml_is_well_formed_document('<pg:foo xmlns:pg="http://postgresql.org/stuf
      second the namespace URI. It is not required that aliases provided in
      this array be the same as those being used in the XML document itself (in
      other words, both in the XML document and in the <function>xpath</function>
-     function context, aliases are <emphasis>local</emphasis>).
+     function context, aliases are <emphasis>local</emphasis>). The default namespace has
+     an empty name (empty string), and only one default namespace may be specified.
     </para>
 
     <para>
@@ -10513,11 +10514,18 @@ SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
 ]]></screen>
     </para>
 
+    <para>
+     Inside a predicate, the literals <literal>and</literal>, <literal>or</literal>,
+     <literal>div</literal> and <literal>mod</literal> are always identified as keywords
+     (XPath operators), and the default namespace is not applied to them.
+     Use a named namespace to qualify elements with these names that appear in an XPath predicate.
+    </para>
+
     <para>
      To deal with default (anonymous) namespaces, do something like this:
 <screen><![CDATA[
-SELECT xpath('//mydefns:b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
-             ARRAY[ARRAY['mydefns', 'http://example.com']]);
+SELECT xpath('//b/text()', '<a xmlns="http://example.com"><b>test</b></a>',
+             ARRAY[ARRAY['', 'http://example.com']]);
 
  xpath
 --------
@@ -10591,8 +10599,7 @@ SELECT xpath_exists('/my:a/text()', '<my:a xmlns:my="http://example.com">test</m
     <para>
      The optional <literal>XMLNAMESPACES</literal> clause is a comma-separated
      list of namespaces.  It specifies the XML namespaces used in
-     the document and their aliases. A default namespace specification
-     is not currently supported.
+     the document and their aliases.
     </para>
 
     <para>
diff --git a/src/backend/utils/adt/Makefile b/src/backend/utils/adt/Makefile
index 1fb018416e..b60a3cfe0d 100644
--- a/src/backend/utils/adt/Makefile
+++ b/src/backend/utils/adt/Makefile
@@ -29,7 +29,7 @@ OBJS = acl.o amutils.o arrayfuncs.o array_expanded.o array_selfuncs.o \
 	tsquery_op.o tsquery_rewrite.o tsquery_util.o tsrank.o \
 	tsvector.o tsvector_op.o tsvector_parser.o \
 	txid.o uuid.o varbit.o varchar.o varlena.o version.o \
-	windowfuncs.o xid.o xml.o
+	windowfuncs.o xid.o xml.o xpath_parser.o
 
 like.o: like.c like_match.c
 
diff --git a/src/backend/utils/adt/xml.c b/src/backend/utils/adt/xml.c
index 7cdb87ef85..7bcffdc442 100644
--- a/src/backend/utils/adt/xml.c
+++ b/src/backend/utils/adt/xml.c
@@ -90,7 +90,7 @@
 #include "utils/rel.h"
 #include "utils/syscache.h"
 #include "utils/xml.h"
-
+#include "utils/xpath_parser.h"
 
 /* GUC variables */
 int			xmlbinary;
@@ -187,6 +187,7 @@ typedef struct XmlTableBuilderData
 	xmlXPathCompExprPtr xpathcomp;
 	xmlXPathObjectPtr xpathobj;
 	xmlXPathCompExprPtr *xpathscomp;
+	bool		with_default_ns;
 } XmlTableBuilderData;
 #endif
 
@@ -227,6 +228,7 @@ const TableFuncRoutine XmlTableRoutine =
 #define NAMESPACE_XSI "http://www.w3.org/2001/XMLSchema-instance"
 #define NAMESPACE_SQLXML "http://standards.iso.org/iso/9075/2003/sqlxml"
 
+#define DEFAULT_NAMESPACE_NAME		"pgdefnamespace.pgsqlxml.internal"
 
 #ifdef USE_LIBXML
 
@@ -3850,6 +3852,7 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	int			ndim;
 	Datum	   *ns_names_uris;
 	bool	   *ns_names_uris_nulls;
+	bool		with_default_ns = false;
 	int			ns_count;
 
 	/*
@@ -3860,6 +3863,8 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 	 * first element defining the name, the second one the URI.  Example:
 	 * ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2',
 	 * 'http://example2.com']].
+	 * When the name is empty string, then URI is used as default namespace.
+	 * Example: ARRAY[ARRAY['', 'http://x.y']]
 	 */
 	ndim = namespaces ? ARR_NDIM(namespaces) : 0;
 	if (ndim != 0)
@@ -3899,7 +3904,6 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 				 errmsg("empty XPath expression")));
 
 	string = pg_xmlCharStrndup(datastr, len);
-	xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
 
 	/*
 	 * In a UTF8 database, skip any xml declaration, which might assert
@@ -3953,6 +3957,26 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 							(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
 							 errmsg("neither namespace name nor URI may be null")));
 				ns_name = TextDatumGetCString(ns_names_uris[i * 2]);
+
+				/* Don't allow the same name as our internal default namespace name */
+				if (strcmp(ns_name, DEFAULT_NAMESPACE_NAME) == 0)
+					ereport(ERROR,
+								(errcode(ERRCODE_RESERVED_NAME),
+								 errmsg("cannot to use \"%s\" as namespace name",
+										  DEFAULT_NAMESPACE_NAME),
+								 errdetail("\"%s\" is reserved for internal purpose",
+										  DEFAULT_NAMESPACE_NAME)));
+				if (*ns_name == '\0')
+				{
+					if (with_default_ns)
+						ereport(ERROR,
+								(errcode(ERRCODE_SYNTAX_ERROR),
+								 errmsg("only one default namespace is allowed")));
+
+					with_default_ns = true;
+					ns_name = DEFAULT_NAMESPACE_NAME;
+				}
+
 				ns_uri = TextDatumGetCString(ns_names_uris[i * 2 + 1]);
 				if (xmlXPathRegisterNs(xpathctx,
 									   (xmlChar *) ns_name,
@@ -3963,6 +3987,16 @@ xpath_internal(text *xpath_expr_text, xmltype *data, ArrayType *namespaces,
 			}
 		}
 
+		if (with_default_ns)
+		{
+			StringInfoData		str;
+
+			transformXPath(&str, text_to_cstring(xpath_expr_text), DEFAULT_NAMESPACE_NAME);
+			xpath_expr = pg_xmlCharStrndup(str.data, str.len);
+		}
+		else
+			xpath_expr = pg_xmlCharStrndup(VARDATA_ANY(xpath_expr_text), xpath_len);
+
 		xpathcomp = xmlXPathCompile(xpath_expr);
 		if (xpathcomp == NULL || xmlerrcxt->err_occurred)
 			xml_ereport(xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
@@ -4207,6 +4241,7 @@ XmlTableInitOpaque(TableFuncScanState *state, int natts)
 	xtCxt->magic = XMLTABLE_CONTEXT_MAGIC;
 	xtCxt->natts = natts;
 	xtCxt->xpathscomp = palloc0(sizeof(xmlXPathCompExprPtr) * natts);
+	xtCxt->with_default_ns = false;
 
 	xmlerrcxt = pg_xml_init(PG_XML_STRICTNESS_ALL);
 
@@ -4299,6 +4334,7 @@ XmlTableSetDocument(TableFuncScanState *state, Datum value)
 #endif							/* not USE_LIBXML */
 }
 
+
 /*
  * XmlTableSetNamespace
  *		Add a namespace declaration
@@ -4309,12 +4345,25 @@ XmlTableSetNamespace(TableFuncScanState *state, const char *name, const char *ur
 #ifdef USE_LIBXML
 	XmlTableBuilderData *xtCxt;
 
-	if (name == NULL)
-		ereport(ERROR,
-				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-				 errmsg("DEFAULT namespace is not supported")));
 	xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableSetNamespace");
 
+	if (name != NULL)
+	{
+		/* Don't allow the same name as our internal default namespace name */
+		if (strcmp(name, DEFAULT_NAMESPACE_NAME) == 0)
+			ereport(ERROR,
+						(errcode(ERRCODE_RESERVED_NAME),
+						 errmsg("cannot to use \"%s\" as namespace name",
+								  DEFAULT_NAMESPACE_NAME),
+						 errdetail("\"%s\" is reserved for internal purpose",
+								  DEFAULT_NAMESPACE_NAME)));
+	}
+	else
+	{
+		xtCxt->with_default_ns = true;
+		name = DEFAULT_NAMESPACE_NAME;
+	}
+
 	if (xmlXPathRegisterNs(xtCxt->xpathcxt,
 						   pg_xmlCharStrndup(name, strlen(name)),
 						   pg_xmlCharStrndup(uri, strlen(uri))))
@@ -4343,6 +4392,14 @@ XmlTableSetRowFilter(TableFuncScanState *state, const char *path)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("row path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathcomp = xmlXPathCompile(xstr);
@@ -4374,6 +4431,14 @@ XmlTableSetColumnFilter(TableFuncScanState *state, const char *path, int colnum)
 				(errcode(ERRCODE_DATA_EXCEPTION),
 				 errmsg("column path filter must not be empty string")));
 
+	if (xtCxt->with_default_ns)
+	{
+		StringInfoData		str;
+
+		transformXPath(&str, path, DEFAULT_NAMESPACE_NAME);
+		path = str.data;
+	}
+
 	xstr = pg_xmlCharStrndup(path, strlen(path));
 
 	xtCxt->xpathscomp[colnum] = xmlXPathCompile(xstr);
diff --git a/src/backend/utils/adt/xpath_parser.c b/src/backend/utils/adt/xpath_parser.c
new file mode 100644
index 0000000000..dff9fa60a8
--- /dev/null
+++ b/src/backend/utils/adt/xpath_parser.c
@@ -0,0 +1,369 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.c
+ *	  XML XPath parser.
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/backend/utils/adt/xpath_parser.c
+ *-------------------------------------------------------------------------
+ */
+#include "postgres.h"
+
+#include "utils/xpath_parser.h"
+
+/*
+ * All PostgreSQL XML functionality is based on the libxml2 library, and
+ * XPath support is no exception.  However, libxml2 doesn't support a
+ * default namespace for XPath expressions.  Because there is no API to
+ * transform or access a parsed XPath expression, we have to parse the
+ * XPath expression here.
+ *
+ * This is implemented with a simple XPath parser/preprocessor.  The
+ * parser transforms an XPath expression into another XPath expression
+ * that libxml2 can evaluate.  It doesn't replace the libxml2 XPath
+ * parser or libxml2 XPath expression evaluation.
+ */
+
+#ifdef USE_LIBXML
+
+/*
+ * We need to work with XPath expression tokens.  When an expression starts
+ * with a node name, a prefix can be added.  When a default namespace is
+ * defined, we have to qualify every node name that has no namespace prefix
+ * with the default namespace; attribute names are left unqualified.
+ */
+
+typedef enum
+{
+	XPATH_TOKEN_NONE,
+	XPATH_TOKEN_NAME,
+	XPATH_TOKEN_STRING,
+	XPATH_TOKEN_NUMBER,
+	XPATH_TOKEN_COLON,
+	XPATH_TOKEN_DCOLON,
+	XPATH_TOKEN_OTHER
+}	XPathTokenType;
+
+typedef struct XPathTokenInfo
+{
+	XPathTokenType ttype;
+	const char	   *start;
+	int			length;
+}	XPathTokenInfo;
+
+typedef struct ParserData
+{
+	const char	   *str;
+	const char	   *cur;
+	XPathTokenInfo buffer;
+	bool		buffer_is_empty;
+}	XPathParserData;
+
+/* Any high-bit-set character is OK (might be part of a multibyte char) */
+#define IS_NODENAME_FIRSTCHAR(c)	 ((c) == '_' || \
+								 ((c) >= 'A' && (c) <= 'Z') || \
+								 ((c) >= 'a' && (c) <= 'z') || \
+								 (IS_HIGHBIT_SET(c)))
+
+#define IS_NODENAME_CHAR(c)		(IS_NODENAME_FIRSTCHAR(c) || (c) == '-' || (c) == '.' || \
+								 ((c) >= '0' && (c) <= '9'))
+
+#define TOKEN_IS_EMPTY(t)		((t).ttype == XPATH_TOKEN_NONE)
+
+/*
+ * Returns next char after last char of token - XPath lexer
+ */
+static const char *
+getXPathToken(const char *str, XPathTokenInfo * ti)
+{
+	/* skip initial spaces */
+	while (*str == ' ')
+		str++;
+
+	if (*str != '\0')
+	{
+		char		c = *str;
+
+		ti->start = str++;
+
+		if (c >= '0' && c <= '9')
+		{
+			while (*str >= '0' && *str <= '9')
+				str++;
+			if (*str == '.')
+			{
+				str++;
+				while (*str >= '0' && *str <= '9')
+					str++;
+			}
+			ti->ttype = XPATH_TOKEN_NUMBER;
+		}
+		else if (IS_NODENAME_FIRSTCHAR(c))
+		{
+			while (IS_NODENAME_CHAR(*str))
+				str++;
+
+			ti->ttype = XPATH_TOKEN_NAME;
+		}
+		else if (c == '"')
+		{
+			while (*str != '\0')
+				if (*str++ == '"')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else if (c == '\'')
+		{
+			while (*str != '\0')
+				if (*str++ == '\'')
+					break;
+
+			ti->ttype = XPATH_TOKEN_STRING;
+		}
+		else if (c == ':')
+		{
+			/* look ahead to detect a double-colon */
+			if (*str == ':')
+			{
+				ti->ttype = XPATH_TOKEN_DCOLON;
+				str++;
+			}
+			else
+				ti->ttype = XPATH_TOKEN_COLON;
+		}
+		else
+			ti->ttype = XPATH_TOKEN_OTHER;
+
+		ti->length = str - ti->start;
+	}
+	else
+	{
+		ti->start = NULL;
+		ti->length = 0;
+
+		ti->ttype = XPATH_TOKEN_NONE;
+	}
+
+	return str;
+}
+
+/*
+ * reset XPath parser stack
+ */
+static void
+initXPathParser(XPathParserData * parser, const char *str)
+{
+	parser->str = str;
+	parser->cur = str;
+	parser->buffer_is_empty = true;
+}
+
+/*
+ * Returns token from stack or read token
+ */
+static void
+nextXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+	{
+		memcpy(ti, &parser->buffer, sizeof(XPathTokenInfo));
+		parser->buffer_is_empty = true;
+	}
+	else
+		parser->cur = getXPathToken(parser->cur, ti);
+}
+
+/*
+ * Push token to stack
+ */
+static void
+pushXPathToken(XPathParserData * parser, XPathTokenInfo * ti)
+{
+	if (!parser->buffer_is_empty)
+		elog(ERROR, "internal error");
+
+	memcpy(&parser->buffer, ti, sizeof(XPathTokenInfo));
+	parser->buffer_is_empty = false;
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * Write token to output string
+ */
+static void
+writeXPathToken(StringInfo str, XPathTokenInfo * ti)
+{
+	Assert(ti->ttype != XPATH_TOKEN_NONE);
+
+	if (ti->ttype != XPATH_TOKEN_OTHER)
+		appendBinaryStringInfo(str, ti->start, ti->length);
+	else
+		appendStringInfoChar(str, *ti->start);
+
+	ti->ttype = XPATH_TOKEN_NONE;
+}
+
+/*
+ * This is the main part of the XPath transformation.  It is called
+ * recursively when the XPath expression contains predicates.
+ */
+static void
+_transformXPath(StringInfo str, XPathParserData * parser,
+				bool inside_predicate,
+				char *def_namespace_name)
+{
+	XPathTokenInfo t1,
+				t2;
+	bool		tagname_needs_defnsp;
+	bool		token_is_tagattrib = false;
+
+	nextXPathToken(parser, &t1);
+
+	while (t1.ttype != XPATH_TOKEN_NONE)
+	{
+		switch (t1.ttype)
+		{
+			case XPATH_TOKEN_NUMBER:
+			case XPATH_TOKEN_STRING:
+			case XPATH_TOKEN_COLON:
+			case XPATH_TOKEN_DCOLON:
+				/* write without any changes */
+				writeXPathToken(str, &t1);
+				/* process fresh token */
+				nextXPathToken(parser, &t1);
+				break;
+
+			case XPATH_TOKEN_NAME:
+				{
+					/*
+					 * Inside predicate ignore keywords (literal operators)
+					 * "and" "or" "div" and "mod".
+					 */
+					if (inside_predicate)
+					{
+						if ((strncmp(t1.start, "and", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "or", 2) == 0 && t1.length == 2) ||
+						 (strncmp(t1.start, "div", 3) == 0 && t1.length == 3) ||
+						 (strncmp(t1.start, "mod", 3) == 0 && t1.length == 3))
+						{
+							token_is_tagattrib = false;
+
+							/* keyword */
+							writeXPathToken(str, &t1);
+							/* process fresh token */
+							nextXPathToken(parser, &t1);
+							break;
+						}
+					}
+
+					tagname_needs_defnsp = true;
+
+					nextXPathToken(parser, &t2);
+					if (t2.ttype == XPATH_TOKEN_COLON)
+					{
+						/* t1 is a qualified node name. no need to add default one. */
+						tagname_needs_defnsp = false;
+
+						/* namespace name */
+						writeXPathToken(str, &t1);
+						/* colon */
+						writeXPathToken(str, &t2);
+						/* get node name */
+						nextXPathToken(parser, &t1);
+					}
+					else if (t2.ttype == XPATH_TOKEN_DCOLON)
+					{
+						/* t1 is an axis name. write out as it is */
+						if (strncmp(t1.start, "attribute", 9) == 0 && t1.length == 9)
+							token_is_tagattrib = true;
+
+						/* axis name */
+						writeXPathToken(str, &t1);
+						/* double colon */
+						writeXPathToken(str, &t2);
+
+						/*
+						 * The next token may be qualified tag name, process
+						 * it as a fresh token.
+						 */
+						nextXPathToken(parser, &t1);
+						break;
+					}
+					else if (t2.ttype == XPATH_TOKEN_OTHER)
+					{
+						/* function name doesn't require namespace */
+						if (*t2.start == '(')
+							tagname_needs_defnsp = false;
+						else
+							pushXPathToken(parser, &t2);
+					}
+
+					if (tagname_needs_defnsp && !token_is_tagattrib)
+						appendStringInfo(str, "%s:", def_namespace_name);
+
+					token_is_tagattrib = false;
+
+					/* write maybe-tagname if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t1))
+						writeXPathToken(str, &t1);
+
+					/* output t2 if not consumed yet */
+					if (!TOKEN_IS_EMPTY(t2))
+						writeXPathToken(str, &t2);
+
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_OTHER:
+				{
+					char		c = *t1.start;
+
+					writeXPathToken(str, &t1);
+
+					if (c == '[')
+						_transformXPath(str, parser, true, def_namespace_name);
+					else
+					{
+						if (c == ']' && inside_predicate)
+						{
+							return;
+						}
+						else if (c == '@')
+						{
+							nextXPathToken(parser, &t1);
+							if (t1.ttype == XPATH_TOKEN_NAME)
+								token_is_tagattrib = true;
+
+							pushXPathToken(parser, &t1);
+						}
+					}
+					nextXPathToken(parser, &t1);
+				}
+				break;
+
+			case XPATH_TOKEN_NONE:
+				elog(ERROR, "should not be here");
+		}
+	}
+}
+
+void
+transformXPath(StringInfo str, const char *xpath,
+			   char *def_namespace_name)
+{
+	XPathParserData parser;
+
+	Assert(def_namespace_name != NULL);
+
+	initStringInfo(str);
+	initXPathParser(&parser, xpath);
+	_transformXPath(str, &parser, false, def_namespace_name);
+
+	elog(DEBUG1, "apply default namespace \"%s\"", str->data);
+}
+
+#endif
diff --git a/src/include/utils/xpath_parser.h b/src/include/utils/xpath_parser.h
new file mode 100644
index 0000000000..57d37df312
--- /dev/null
+++ b/src/include/utils/xpath_parser.h
@@ -0,0 +1,23 @@
+/*-------------------------------------------------------------------------
+ *
+ * xpath_parser.h
+ *	  Declarations for XML XPath transformation.
+ *
+ *
+ * Portions Copyright (c) 1996-2016, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
+ *
+ * src/include/utils/xpath_parser.h
+ *
+ *-------------------------------------------------------------------------
+ */
+
+#ifndef XPATH_PARSER_H
+#define XPATH_PARSER_H
+
+#include "postgres.h"
+#include "lib/stringinfo.h"
+
+void transformXPath(StringInfo str, const char *xpath, char *def_namespace_name);
+
+#endif   /* XPATH_PARSER_H */
diff --git a/src/test/regress/expected/xml.out b/src/test/regress/expected/xml.out
index 7fa1309108..fe8374d14b 100644
--- a/src/test/regress/expected/xml.out
+++ b/src/test/regress/expected/xml.out
@@ -1120,7 +1120,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1487,3 +1491,69 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- both string separators are supported
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge="haha"]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge=''haha'']') AS x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/expected/xml_1.out b/src/test/regress/expected/xml_1.out
index 970ab26fce..3aa32d51d5 100644
--- a/src/test/regress/expected/xml_1.out
+++ b/src/test/regress/expected/xml_1.out
@@ -1337,3 +1337,75 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
 ---
 (0 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+ERROR:  unsupported XML feature
+LINE 1: INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a ...
+                                  ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+(0 rows)
+
+-- both string separators are supported
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge="haha"]') AS x;
+ data 
+------
+(0 rows)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge=''haha'']') AS x;
+ data 
+------
+(0 rows)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+(0 rows)
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="ht...
+                                                    ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x...
+                                              ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: ...ELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xml...
+                                                             ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  unsupported XML feature
+LINE 1: SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="h...
+                                                     ^
+DETAIL:  This functionality requires the server to be built with libxml support.
+HINT:  You need to rebuild PostgreSQL using --with-libxml.
diff --git a/src/test/regress/expected/xml_2.out b/src/test/regress/expected/xml_2.out
index 112ebe47cd..58b46bcc24 100644
--- a/src/test/regress/expected/xml_2.out
+++ b/src/test/regress/expected/xml_2.out
@@ -1100,7 +1100,11 @@ SELECT * FROM XMLTABLE(XMLNAMESPACES(DEFAULT 'http://x.y'),
                       '/rows/row'
                       PASSING '<rows xmlns="http://x.y"><row><a>10</a></row></rows>'
                       COLUMNS a int PATH 'a');
-ERROR:  DEFAULT namespace is not supported
+ a  
+----
+ 10
+(1 row)
+
 -- used in prepare statements
 PREPARE pp AS
 SELECT  xmltable.*
@@ -1467,3 +1471,69 @@ SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c
  14
 (4 rows)
 
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+ data 
+------
+   50
+(1 row)
+
+-- both string separators are supported
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge="haha"]') AS x;
+ data 
+------
+   50
+(1 row)
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge=''haha'']') AS x;
+ data 
+------
+   50
+(1 row)
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+ERROR:  cannot to use "pgdefnamespace.pgsqlxml.internal" as namespace name
+DETAIL:  "pgdefnamespace.pgsqlxml.internal" is reserved for internal purpose
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+                      xpath                       
+--------------------------------------------------
+ {"<a xmlns=\"http://x.y\" hoge=\"haha\">50</a>"}
+(1 row)
+
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+ xpath_exists 
+--------------
+ t
+(1 row)
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
+ERROR:  only one default namespace is allowed
diff --git a/src/test/regress/sql/xml.sql b/src/test/regress/sql/xml.sql
index cb96e18005..55b49d6dba 100644
--- a/src/test/regress/sql/xml.sql
+++ b/src/test/regress/sql/xml.sql
@@ -594,3 +594,27 @@ INSERT INTO xmltest2 VALUES('<d><r><dc>2</dc></r></d>', 'D');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable('/d/r' PASSING x COLUMNS a int PATH '' || lower(_path) || 'c');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH '.');
 SELECT xmltable.* FROM xmltest2, LATERAL xmltable(('/d/r/' || lower(_path) || 'c') PASSING x COLUMNS a int PATH 'x' DEFAULT ascii(_path) - 54);
+
+-- default namespaces
+CREATE TABLE t1 (id int, doc xml);
+INSERT INTO t1 VALUES (5, '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>');
+
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS x), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[1][@hoge]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'child::a[1][attribute::hoge="haha"]') as x;
+
+-- both string separators are supported
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge="haha"]') AS x;
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES(DEFAULT 'http://x.y'), '/rows/row' PASSING t1.doc COLUMNS data int PATH 'a[@hoge=''haha'']') AS x;
+
+-- should fail
+SELECT x.* FROM t1, xmltable(XMLNAMESPACES('http://x.y' AS "pgdefnamespace.pgsqlxml.internal"), '/x:rows/x:row' PASSING t1.doc COLUMNS data int PATH 'x:a[1][@hoge]') AS x;
+
+-- xpath and xpath_exists supports namespaces too
+SELECT xpath('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+SELECT xpath_exists('/x:rows/x:row/x:a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['x', 'http://x.y']]);
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y']]);
+
+-- should fail
+SELECT xpath_exists('/rows/row/a[1][@hoge]', '<rows xmlns="http://x.y"><row><a hoge="haha">50</a></row></rows>', ARRAY[ARRAY['', 'http://x.y'], ARRAY['', 'http://x.z']]);
#20Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Pavel Stehule (#19)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hello.

At Wed, 24 Jan 2018 10:30:39 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRBVUVvG1CXxgrs0UipTziUX6M788z-=L9gQvwAB4UGLeg@mail.gmail.com>

Hi

2018-01-23 8:13 GMT+01:00 Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp

I have three comments on the behavior and one on documentation.

1. Lack of syntax handling.

["'" [^'] "'"] is also a string literal, but getXPathToken is
forgetting that and mistakenly applying the default namespace to the
string content.

In this case, I am not sure if I understand well.

Within expressions, literal strings are delimited by single or double
quotation marks, which are also used to delimit XML attributes. To avoid a
quotation mark in an expression being interpreted by the XML processor as
terminating the attribute value the quotation mark can be entered as a
character reference (&quot; or &apos;). Alternatively, the expression can
use single quotation marks if the XML attribute is delimited with double
quotation marks or vice-versa.

I think it is correct understanding.

So if I understand well, an XML string can look like ' some " some ' or
" some ' some ". I fixed it.

Thanks. It looks good.
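
A minimal sketch of the quoting rule discussed above, for illustration only:
a literal is closed by the same quote character that opened it, so the other
quote character may appear inside it. The helper name skip_xpath_literal() is
hypothetical; this is not the patch's getXPathToken().

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not taken from the patch): if s starts an XPath
 * string literal, return a pointer just past its closing quote. Return NULL
 * if s does not start a literal or the literal is unterminated.
 */
static const char *
skip_xpath_literal(const char *s)
{
	char		quote;

	assert(s != NULL);

	quote = *s;
	if (quote != '"' && quote != '\'')
		return NULL;

	/* XPath 1.0 literals have no escapes: scan to the same quote character */
	for (s++; *s != '\0'; s++)
	{
		if (*s == quote)
			return s + 1;
	}

	return NULL;				/* unterminated literal */
}
```

A tokenizer built this way would copy the whole literal to the output
unchanged, so names inside it are never prefixed.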

2. Additional comment might be good.

It might be better to have an additional description of the default
namespace in the comment that starts with "Namespace mappings are
passed as text[]" in xpath_internal().

fixed

Thanks.

3. Inconsistent behavior from named namespace.

| - function context, aliases are <emphasis>local</emphasis>).
| + function context, aliases are <emphasis>local</emphasis>). Default
namespace has
| + empty name (empty string) and should be only one.

This works as described; on the other hand, the same
namespace prefix can be defined twice or more in the array and
the last one is in effect. I don't see a reason for
differentiating the default namespace case.

It looks like a libxml2 bug. It makes no sense to use more than one default
namespace, and although this is inconsistent with other namespaces, I
think it is correct. It is better to raise an error early. For named
namespaces Postgres expects libxml2 to perform all namespace checks, and
libxml2 is tolerant. The default namespace is implemented inside Postgres,
and I don't see any advantage in tolerant behavior. Multiple default
namespaces are disallowed for XMLTABLE, so there would be some inconsistency
either way. In this case I prefer to raise an error to signal ambiguous or
badly formatted input clearly.

Ok. I'm fine with that.
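
The early check agreed on above could look roughly like the following sketch.
It assumes the mappings have already been unpacked into a plain array, and the
NsMapping type and find_default_namespace() are hypothetical names, not code
from the patch. The empty prefix marks the default namespace, as in the
regression tests.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical unpacked form of one text[] namespace mapping. */
typedef struct NsMapping
{
	const char *prefix;			/* "" denotes the default namespace */
	const char *uri;
} NsMapping;

/*
 * Return the index of the default-namespace entry, -1 if there is none, or
 * -2 if more than one is declared. On -2 the caller would raise
 * "only one default namespace is allowed".
 */
static int
find_default_namespace(const NsMapping *ns, size_t n)
{
	int			found = -1;
	size_t		i;

	assert(ns != NULL || n == 0);

	for (i = 0; i < n; i++)
	{
		if (ns[i].prefix[0] != '\0')
			continue;			/* named namespace, left to libxml2 */
		if (found >= 0)
			return -2;			/* second default: reject early */
		found = (int) i;
	}

	return found;
}
```

Named prefixes are deliberately not checked for duplicates here, which matches
the behavior described above where libxml2 keeps the last definition.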

4. Comments on the documentation part.

# Even though I'm not suitable for commenting on wording...

| + Inside predicate literals <literal>and</literal>,
<literal>or</literal>,
| + <literal>div</literal> and <literal>mod</literal> are used as
keywords
| + (XPath operators) every time and default namespace are not applied
there.

*I*'d like to have a comma between the predicate and literals,
and have an 'a' before predicate. Or 'Literals .. inside a
predicate' might be better?

fixed

'are used as keywords' might be better being 'are identified as
keywords'?

fixed

Default namespace is applied to tag names except the listed
keywords even inside a predicate. So 'are not applied there'
might be better being 'are not applied to them'? Or 'are not
applied in the case'?

| + If you would to use these literals like tag names, then the
default namespace
| + should not be used, and these literals should be explicitly
| + labeled.
| + </para>

Default namespace is not applied *only to* such keywords inside a
predicate. Even if an Xpath expression contains such a tag name,
default namespace still works for other tags. Does the following
make sense?

+ Use named namespace to qualify such tag names appear in an
+ XPath predicate.

fixed

I hope some native speaker can finalize the docs. It is beyond my knowledge.

I am also anxious for that.

===
After the above are addressed (or even rejected), I think I
have no additional comments.

- This current patch applies safely (with small shifts) on the
current master.

- The code looks fine for me.

- This patch translates the given XPath expression by prefixing
unprefixed tag names with a special namespace prefix only in
the case where default namespace is defined, so the existing
behavior is not affected.

- The syntax already exists but is just not implemented, so I don't
think any arguments are needed here.

- It undocumentedly inhibits the usage of the namespace prefix
"pgdefnamespace.pgsqlxml.internal" but I believe no one can
notice that.

- The default-ns translator (xpath_parser.c) seems to work
perfectly with some harmless exceptions.

(The XPath specification is here: https://www.w3.org/TR/1999/REC-xpath-19991116/)

Related unused features (and not documented?):
context variables ($n notations),
user-defined functions (or function names prefixed by a namespace
prefix)

I did a quick check, and these features are not supported by libxml2,
so it is a question whether they should be parsed by our XPath parser. So it
is not possible to check or test them :(

I'm fine with that. I don't think that test for them is needed
since PostgreSQL doesn't support them anyway. Sorry for the
confusing comment. (libxml2 complains about that in the following
way.)

| =# select xpath('/a/text()=$', '<a>test</a>');
| ERROR: invalid XPath expression
!| DETAIL: Expected $ for variable reference
| CONTEXT: SQL function "xpath" statement 1

| =# select xpath('func(/a/text())', '<a>test</a>');
| ERROR: could not create XPath object
!| DETAIL: Unregistered function
| CONTEXT: SQL function "xpath" statement 1

Newly documented behavior:
the default namespace isn't applied to and/or/div/mod.

- Documentation looks sufficient.

- Regression test doesn't cover the XPath syntax but it's not
viable. I am fine with the basic test cases added by the
current patch.

regards,

I am sending updated version.

Very much thanks for very precious review

It's my pleasure. Sorry for my slow responses.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#21Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Kyotaro HORIGUCHI (#20)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hello. I reviewed this and think that this is in Ready for
Committer stage.

The patch is available here.

/messages/by-id/CAFj8pRBVUVvG1CXxgrs0UipTziUX6M788z-=L9gQvwAB4UGLeg@mail.gmail.com

The following list consists of the same items in upthread message
as confirmation.

- This applies to the current master HEAD cleanly.

- The code looks fine.

- This patch translates the given XPath expression by prefixing
unprefixed tag names with a special namespace prefix only in
the case where default namespace is defined, so the existing
behavior is not affected.

- The syntax of default namespace already exists but is just not usable,
so I don't think any arguments are needed here.

- It undocumentedly inhibits the usage of the namespace prefix
"pgdefnamespace.pgsqlxml.internal" but I believe no one can
notice that.

- The default-ns translator (xpath_parser.c) seems to work
perfectly with some harmless exceptions. (Behavior about
context variables and user-defined xml functions, which are not
handled by PostgreSQL.)

- Documentation looks sufficient.

- Regression test doesn't cover the XPath syntax but I think it's
not viable. I am fine with the basic test cases added by the
current patch.
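
To illustrate the translation described in the list above, here is a much
simplified sketch. It is not the patch's xpath_parser.c: the function
add_default_prefix() and its helpers are hypothetical, it writes into a
caller-supplied buffer instead of a StringInfo, and it ignores whitespace
between a name and a following "(" or ":". It only shows the idea of
prefixing unprefixed element names while leaving literals, axis names,
function names, attributes, and the and/or/div/mod operators inside
predicates untouched.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Append n bytes of src to the NUL-terminated buffer out; -1 on overflow. */
static int
put(char *out, size_t cap, const char *src, size_t n)
{
	size_t		used = strlen(out);

	if (used + n + 1 > cap)
		return -1;
	memcpy(out + used, src, n);
	out[used + n] = '\0';
	return 0;
}

static int
is_name_char(char c)
{
	return isalnum((unsigned char) c) || c == '_' || c == '-' || c == '.';
}

static int
is_operator_name(const char *s, size_t n)
{
	return (n == 2 && memcmp(s, "or", 2) == 0) ||
		(n == 3 && (memcmp(s, "and", 3) == 0 ||
					memcmp(s, "div", 3) == 0 ||
					memcmp(s, "mod", 3) == 0));
}

/* Copy xpath to out, adding "prefix:" to unprefixed element names. */
static int
add_default_prefix(const char *xpath, const char *prefix,
				   char *out, size_t cap)
{
	const char *p = xpath;
	int			depth = 0;		/* predicate nesting level */
	int			is_attr = 0;	/* next name is an attribute name */

	assert(xpath != NULL && prefix != NULL && out != NULL && cap > 0);
	out[0] = '\0';

	while (*p != '\0')
	{
		const char *start = p;

		if (*p == '"' || *p == '\'')
		{
			/* string literal: copy through the matching quote */
			char		quote = *p++;

			while (*p != '\0' && *p != quote)
				p++;
			if (*p == quote)
				p++;
		}
		else if (isalpha((unsigned char) *p) || *p == '_')
		{
			size_t		len;

			while (is_name_char(*p))
				p++;
			len = (size_t) (p - start);

			if (p[0] == ':' && p[1] == ':')
			{
				/* axis name: remember whether it selects attributes */
				is_attr = (len == 9 && memcmp(start, "attribute", 9) == 0);
				p += 2;
			}
			else if (p[0] == ':')
			{
				/* already qualified: copy prefix, colon and local name */
				p++;
				while (is_name_char(*p))
					p++;
				is_attr = 0;
			}
			else if (p[0] == '(' || is_attr ||
					 (depth > 0 && is_operator_name(start, len)))
				is_attr = 0;	/* function, attribute or operator */
			else if (put(out, cap, prefix, strlen(prefix)) < 0 ||
					 put(out, cap, ":", 1) < 0)
				return -1;
		}
		else
		{
			if (*p == '[')
				depth++;
			else if (*p == ']' && depth > 0)
				depth--;
			is_attr = (*p == '@');
			p++;
		}

		if (put(out, cap, start, (size_t) (p - start)) < 0)
			return -1;
	}
	return 0;
}
```

With prefix "p", this sketch turns /rows/row/a[1][@hoge] into
/p:rows/p:row/p:a[1][@hoge] and leaves /x:rows/text() unchanged. An
expression that never mentions an unprefixed element name passes through
untouched, which is why existing behavior is not affected.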

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

#22Andrew Dunstan
andrew.dunstan@2ndquadrant.com
In reply to: Pavel Stehule (#19)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On 01/24/2018 04:30 AM, Pavel Stehule wrote:

I am sending updated version.

Very much thanks for very precious review

Thomas,

I am unable to replicate the Linux failure seen in the cfbot on my
Fedora machine. Both when building with libxml2 and without, after
applying the latest patch the tests pass without error. Can you please
investigate what's going on here?

cheers

andrew

--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#23Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Andrew Dunstan (#22)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On Fri, Aug 10, 2018 at 6:26 AM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

On 01/24/2018 04:30 AM, Pavel Stehule wrote:

I am sending updated version.

Very much thanks for very precious review

Thomas,

I am unable to replicate the Linux failure seen in the cfbot on my
Fedora machine. Both when building with libxml2 and without, after
applying the latest patch the tests pass without error. Can you please
investigate what's going on here?

Well this is strange... I can't reproduce the problem either with or
without --with-libxml on a Debian box (was trying to get fairly close
to the OS that Travis runs on). But I see the same failure when I
apply the patch on my FreeBSD 12 laptop and test without
--with-libxml. Note that when cfbot runs it, the patch is applied
with FreeBSD patch, and then it's tested without --with-libxml on
Ubuntu (Travis's default OS). [Side note: I should change it to build
--with-libxml, but that's not the point.] So the common factor is a
different patch implementation. I wonder if a hunk is being
misinterpreted.

--
Thomas Munro
http://www.enterprisedb.com

#24Pavel Stehule
pavel.stehule@gmail.com
In reply to: Thomas Munro (#23)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

po 17. 9. 2018 v 2:05 odesílatel Thomas Munro <thomas.munro@enterprisedb.com>
napsal:

On Fri, Aug 10, 2018 at 6:26 AM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

On 01/24/2018 04:30 AM, Pavel Stehule wrote:

I am sending updated version.

Very much thanks for very precious review

Thomas,

I am unable to replicate the Linux failure seen in the cfbot on my
Fedora machine. Both when building with libxml2 and without, after
applying the latest patch the tests pass without error. Can you please
investigate what's going on here?

Well this is strange... I can't reproduce the problem either with or
without --with-libxml on a Debian box (was trying to get fairly close
to the OS that Travis runs on). But I see the same failure when I
apply the patch on my FreeBSD 12 laptop and test without
--with-libxml. Note that when cfbot runs it, the patch is applied
with FreeBSD patch, and then it's tested without --with-libxml on
Ubuntu (Travis's default OS). [Side note: I should change it to build
--with-libxml, but that's not the point.] So the common factor is a
different patch implementation. I wonder if a hunk is being
misinterpreted.

This patch is not too large. Please, can you send me the related files? I can
check them manually.

Regards

Pavel

--
Thomas Munro
http://www.enterprisedb.com

#25Thomas Munro
thomas.munro@enterprisedb.com
In reply to: Pavel Stehule (#24)
2 attachment(s)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On Mon, Sep 17, 2018 at 5:36 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

po 17. 9. 2018 v 2:05 odesílatel Thomas Munro <thomas.munro@enterprisedb.com> napsal:

On Fri, Aug 10, 2018 at 6:26 AM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

On 01/24/2018 04:30 AM, Pavel Stehule wrote:

I am sending updated version.

Very much thanks for very precious review

Thomas,

I am unable to replicate the Linux failure seen in the cfbot on my
Fedora machine. Both when building with libxml2 and without, after
applying the latest patch the tests pass without error. Can you please
investigate what's going on here?

Well this is strange... I can't reproduce the problem either with or
without --with-libxml on a Debian box (was trying to get fairly close
to the OS that Travis runs on). But I see the same failure when I
apply the patch on my FreeBSD 12 laptop and test without
--with-libxml. Note that when cfbot runs it, the patch is applied
with FreeBSD patch, and then it's tested without --with-libxml on
Ubuntu (Travis's default OS). [Side note: I should change it to build
--with-libxml, but that's not the point.] So the common factor is a
different patch implementation. I wonder if a hunk is being
misinterpreted.

This patch is not too large. Please, can me send a related files, I can check it manually.

I confirmed that xml_1.out is different depending on which 'patch' you
use. I've attached the output from FreeBSD patch 2.0-12u11 and GNU
patch 2.5.8. It's an interesting phenomenon, probably due to having a
huge long file with a lot of repeated text and slightly different
algorithms or parameters, but I don't think you need to worry about it
for this. Sorry for the distraction.

--
Thomas Munro
http://www.enterprisedb.com

Attachments:

xml_1.out.bsd (application/octet-stream)
xml_1.out.gnu (application/octet-stream)
#26Pavel Stehule
pavel.stehule@gmail.com
In reply to: Thomas Munro (#25)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

po 17. 9. 2018 v 23:15 odesílatel Thomas Munro <
thomas.munro@enterprisedb.com> napsal:

On Mon, Sep 17, 2018 at 5:36 PM Pavel Stehule <pavel.stehule@gmail.com>
wrote:

po 17. 9. 2018 v 2:05 odesílatel Thomas Munro <

thomas.munro@enterprisedb.com> napsal:

On Fri, Aug 10, 2018 at 6:26 AM Andrew Dunstan
<andrew.dunstan@2ndquadrant.com> wrote:

On 01/24/2018 04:30 AM, Pavel Stehule wrote:

I am sending updated version.

Very much thanks for very precious review

Thomas,

I am unable to replicate the Linux failure seen in the cfbot on my
Fedora machine. Both when building with libxml2 and without, after
applying the latest patch the tests pass without error. Can you please
investigate what's going on here?

Well this is strange... I can't reproduce the problem either with or
without --with-libxml on a Debian box (was trying to get fairly close
to the OS that Travis runs on). But I see the same failure when I
apply the patch on my FreeBSD 12 laptop and test without
--with-libxml. Note that when cfbot runs it, the patch is applied
with FreeBSD patch, and then it's tested without --with-libxml on
Ubuntu (Travis's default OS). [Side note: I should change it to build
--with-libxml, but that's not the point.] So the common factor is a
different patch implementation. I wonder if a hunk is being
misinterpreted.

This patch is not too large. Please, can me send a related files, I can

check it manually.

I confirmed that xml_1.out is different depending on which 'patch' you
use. I've attached the output from FreeBSD patch 2.0-12u11 and GNU
patch 2.5.8. It's an interesting phenomenon, probably due to having a
very long file with a lot of repeated text and slightly different
algorithms or parameters, but I don't think you need to worry about it
for this. Sorry for the distraction.

OK. Thank you for the info.

Regards

Pavel

--
Thomas Munro
http://www.enterprisedb.com

#27Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#19)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Pavel Stehule <pavel.stehule@gmail.com> writes:

[ xml-xpath-default-ns-7.patch ]

At Andrew's prompting, I took a look over this patch. I don't know much
of anything about XML, so I have no idea as to standards compliance here,
but I do have some comments:

* I'm fairly uncomfortable with the idea that we're going to maintain
our own XPath parser. That seems like a recipe for lots of future
work ... and the code is far too underdocumented for anyone to actually
maintain it. (Case in point: _transformXPath's arguments are not
documented at all, and in fact there's hardly a word about what it's
even supposed to *do*.)

* I think the business with "pgdefnamespace.pgsqlxml.internal" is just
plain awful. It's a wart, and I don't think it's even saving you any
code once you account for all the places where you have to throw errors
for somebody trying to use that as a regular name. This should be done
with out-of-band signaling if possible. The existing convention above
this code is that a NULL pointer means a default namespace; can't that
be continued throughout? If that's not practical, can you pick a string
that simply can't syntactically be a namespace name? (In the SQL world,
for example, an empty string is an illegal identifier so that that could
be used for the purpose. But I don't know if that applies to XML.)
Or perhaps you can build a random name that is chosen just to make it
different from any of the listed namespaces? If none of those work,
I think passing around an explicit "isdefault" boolean alongside the name
would be better than having a reserved word.

* _transformXPath recurses, so doesn't it need check_stack_depth()?

* I'm not especially in love with using function names that start
with an underscore. I do not think that adds anything, and it's
unlike the style in most places in PG.

* This is a completely unhelpful error message:
+ if (!parser->buffer_is_empty)
+ elog(ERROR, "internal error");
If you can't be bothered to write a message that says something
useful, either drop the test or turn it into an Assert. I see some
other internal errors that aren't up to project norms either.

* Either get rid of the "elog(DEBUG1)"s, or greatly lower their
message priority. They might've been appropriate for developing
this patch, but they are not okay to commit that way.

* Try to reduce the amount of random whitespace changes in the patch.

* .h files should never #include "postgres.h", per project policy.

* I'm not sure I'd bother with putting the new code into a separate
file rather than cramming it into xml.c. The main reason why not
is that you're going to get "empty translation unit" warnings from
some compilers in builds without USE_LIBXML.

* Documentation, comments, and error messages could all use some
copy-editing by a native English speaker (you knew that of course).

regards, tom lane

#28Pavel Stehule
pavel.stehule@gmail.com
In reply to: Tom Lane (#27)
1 attachment(s)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi

On Tue, Sep 18, 2018 at 17:33, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

[ xml-xpath-default-ns-7.patch ]

At Andrew's prompting, I took a look over this patch. I don't know much
of anything about XML, so I have no idea as to standards compliance here,
but I do have some comments:

* I'm fairly uncomfortable with the idea that we're going to maintain
our own XPath parser. That seems like a recipe for lots of future
work ... and the code is far too underdocumented for anyone to actually
maintain it. (Case in point: _transformXPath's arguments are not
documented at all, and in fact there's hardly a word about what it's
even supposed to *do*.)

I understand, and I would be much happier if libxml2 supported this
feature. But development of new features in that library is frozen, so
there is no possibility of that.

On the other hand, the parser is very simple: the tokenizer detects identifiers,
strings, and numbers, and the parser looks for unqualified names and emits the
default namespace identifier before them. It does nothing more.
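
To illustrate the kind of rewrite described here, the following is a
minimal standalone sketch in C. It is not the code from the patch. The
names add_default_prefix, emit, is_name_start, is_name_char and
is_operator_name are hypothetical, and the sketch makes several
simplifying assumptions: names are ASCII only, the output buffer is
supplied by the caller, and attribute names, variable references,
function calls, axis names and already-prefixed names are copied
unchanged. It does not cover every corner of the XPath grammar.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool is_name_start(char c) { return isalpha((unsigned char) c) || c == '_'; }

static bool is_name_char(char c)
{
    return isalnum((unsigned char) c) || c == '_' || c == '-' || c == '.';
}

/* "and", "or", "div" and "mod" are operators when they follow an operand. */
static bool is_operator_name(const char *s, size_t n)
{
    return (n == 2 && memcmp(s, "or", 2) == 0) ||
           (n == 3 && (memcmp(s, "and", 3) == 0 || memcmp(s, "div", 3) == 0 ||
                       memcmp(s, "mod", 3) == 0));
}

/* Append n bytes to out, keeping it NUL-terminated; false if it won't fit. */
static bool emit(char *out, size_t outsize, size_t *len, const char *src, size_t n)
{
    if (*len + n + 1 > outsize)
        return false;
    memcpy(out + *len, src, n);
    *len += n;
    out[*len] = '\0';
    return true;
}

/* Copy xpath to out, writing "prefix:" before each unqualified element name. */
static bool add_default_prefix(const char *xpath, const char *prefix,
                               char *out, size_t outsize)
{
    const char *p = xpath;
    size_t len = 0;
    bool prev_operand = false;  /* could the previous token end an operand? */
    bool keep_next = false;     /* next name follows '@', '$' or "attribute::" */

    assert(xpath != NULL && prefix != NULL && out != NULL && outsize > 0);
    out[0] = '\0';
    while (*p != '\0') {
        const char *start = p;

        if (*p == '\'' || *p == '"') {          /* string literal */
            const char *endq = strchr(p + 1, *p);
            p = endq ? endq + 1 : p + strlen(p);
            prev_operand = true;
            keep_next = false;
        } else if (is_name_start(*p)) {
            bool verbatim = keep_next;
            const char *la;
            size_t n;
            keep_next = false;
            while (is_name_char(*p))
                p++;
            n = (size_t) (p - start);
            if (p[0] == ':' && p[1] != ':') {   /* already prefixed */
                verbatim = true;
                p += (p[1] == '*') ? 2 : 1;
                while (is_name_char(*p))
                    p++;
            }
            for (la = p; isspace((unsigned char) *la); la++)
                ;                               /* look ahead past whitespace */
            if (verbatim) {
                prev_operand = true;
            } else if (*la == '(') {            /* function or node type test */
                prev_operand = false;
            } else if (la[0] == ':' && la[1] == ':') {  /* axis name */
                keep_next = (n == 9 && memcmp(start, "attribute", 9) == 0);
                prev_operand = false;
                p = la + 2;                     /* copy the "::" along with it */
            } else if (prev_operand && is_operator_name(start, n)) {
                prev_operand = false;
            } else {                            /* unqualified element name */
                if (!emit(out, outsize, &len, prefix, strlen(prefix)) ||
                    !emit(out, outsize, &len, ":", 1))
                    return false;
                prev_operand = true;
            }
        } else {
            char c = *p++;
            if (!isspace((unsigned char) c)) {
                prev_operand = (c == '*') ? !prev_operand   /* multiply vs. wildcard */
                    : (c == ')' || c == ']' || c == '.' || isdigit((unsigned char) c));
                keep_next = (c == '@' || c == '$');
            }
        }
        if (!emit(out, outsize, &len, start, (size_t) (p - start)))
            return false;
    }
    return true;
}
```

With the prefix "x", this sketch turns '/rows/row[a and b > 1]' into
'/x:rows/x:row[x:a and x:b > 1]'. The caller would still have to
register "x" for the default namespace URI before evaluating the result.
Leaving attribute names alone follows the XML namespace rule that a
default namespace declaration does not apply to attributes; whether the
patch does the same is not confirmed here.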

I renamed the function _transformXPath to transform_xpath_recurse and
described its parameters.

* I think the business with "pgdefnamespace.pgsqlxml.internal" is just
plain awful. It's a wart, and I don't think it's even saving you any
code once you account for all the places where you have to throw errors
for somebody trying to use that as a regular name. This should be done
with out-of-band signaling if possible. The existing convention above
this code is that a NULL pointer means a default namespace; can't that
be continued throughout? If that's not practical, can you pick a string
that simply can't syntactically be a namespace name? (In the SQL world,
for example, an empty string is an illegal identifier so that that could
be used for the purpose. But I don't know if that applies to XML.)
Or perhaps you can build a random name that is chosen just to make it
different from any of the listed namespaces? If none of those work,
I think passing around an explicit "isdefault" boolean alongside the name
would be better than having a reserved word.

The string used as the default namespace name must be a valid XML namespace
name, because it is injected into the XPath expression and passed to
libxml2 as a namespace name. libxml2 requires a valid, non-null value.

I changed it and now the namespace name is generated.

* _transformXPath recurses, so doesn't it need check_stack_depth()?

fixed

* I'm not especially in love with using function names that start
with an underscore. I do not think that adds anything, and it's
unlike the style in most places in PG.

renamed

* This is a completely unhelpful error message:
+       if (!parser->buffer_is_empty)
+               elog(ERROR, "internal error");
If you can't be bothered to write a message that says something
useful, either drop the test or turn it into an Assert.  I see some
other internal errors that aren't up to project norms either.

Changed to an Assert.

* Either get rid of the "elog(DEBUG1)"s, or greatly lower their
message priority. They might've been appropriate for developing
this patch, but they are not okay to commit that way.

removed

* Try to reduce the amount of random whitespace changes in the patch.

The formatting was really strange, fixed

* .h files should never #include "postgres.h", per project policy.

I moved the code to xml.c, as you proposed.

* I'm not sure I'd bother with putting the new code into a separate
file rather than cramming it into xml.c. The main reason why not
is that you're going to get "empty translation unit" warnings from
some compilers in builds without USE_LIBXML.

* Documentation, comments, and error messages could all use some
copy-editing by a native English speaker (you knew that of course).

I hope some native speakers will take a look at it.

Thank you for the comments.

An updated patch is attached.

Regards

Pavel

regards, tom lane

Attachments:

default_namespace-20180921.patch.gz (application/gzip)
#29Dmitry Dolgov
9erthalion6@gmail.com
In reply to: Pavel Stehule (#28)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On Fri, Sep 21, 2018 at 1:30 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

Thank you for comments

Attached updated patch

Unfortunately, the current version of the patch doesn't pass make check; something
is missing for the xml tests. Could you please rebase it?

After that, I hope one of the reviewers (Kyotaro?) can confirm whether
it's still in good shape. For now I'm moving it to the next CF.

#30Pavel Stehule
pavel.stehule@gmail.com
In reply to: Dmitry Dolgov (#29)
1 attachment(s)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi

On Thu, Nov 29, 2018 at 14:44, Dmitry Dolgov <9erthalion6@gmail.com> wrote:

On Fri, Sep 21, 2018 at 1:30 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

Thank you for comments

Attached updated patch

Unfortunately, current version of the patch doesn't pass make check, something
is missing for xml tests. Could you please rebase it?

After that I hope someone from reviewers (Kyotaro?) can probably confirm if
it's still in a good shape. For now I'm moving it to the next CF.

Here is the rebased patch.

Regards

Pavel

Attachments:

default-namespaces-20181130.patch.gz (application/gzip)
#31Kyotaro HORIGUCHI
horiguchi.kyotaro@lab.ntt.co.jp
In reply to: Pavel Stehule (#30)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hello.

At Fri, 30 Nov 2018 07:48:26 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRD7Zg07t4NpZu09T4RgXz0bTvyYg2eMVoH+o_drNoiz6w@mail.gmail.com>

Hi

On Thu, Nov 29, 2018 at 14:44, Dmitry Dolgov <9erthalion6@gmail.com> wrote:

On Fri, Sep 21, 2018 at 1:30 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

Thank you for comments

Attached updated patch

Unfortunately, current version of the patch doesn't pass make check, something
is missing for xml tests. Could you please rebase it?

After that I hope someone from reviewers (Kyotaro?) can probably confirm if
it's still in a good shape. For now I'm moving it to the next CF.

Sure. Sorry for coming late. I reconfirmed this.

The most significant change in this version is namespace name
generation.

- We can remove strlen from mutate_name by mutating the string in
reverse order. We don't need to mutate it in a human-friendly
order. The name would be one letter long in almost all cases.

Concretely, the order I have in mind is as follows:

"" "a" "b" ..."z" "aa" "ba" "ca"... "za" "ab"..

- Should 'propriety' be 'properties'?

+ /* register namespace if all propriety are available */

- Is the "if" a typo for "in"?

 +     * collect ns names if ResTarget format for possible usage
 +     * in getUniqNames function.

- I suppose the following should be like "register default
namespace definition if any".

+ /* get default namespace name when it is required */

The following may not be new. (Note that I'm not a native speaker.)

- I cannot read this. (I might be to blame..)

  + * default namespace for XPath expressions. Because there are not any API
  + * how to transform or access to parsed XPath expression we have to parse
  + * XPath here.

- This might need to explain "by what".

 + * Those functionalities are implemented with a simple XPath parser/
 + * preprocessor. This XPath parser transforms a XPath expression to another
 + * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
 + * replace libxml2 XPath parser or libxml2 XPath expression evaluation.

- "add" -> "adds", "def_namespace_name" seems to need to be
replaced with something else.

 + * This transformation add def_namespace_name to any unqualified node name
 + * or attribute name of xpath expression.

(Sorry, I'll look further later.)

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

#32Pavel Stehule
pavel.stehule@gmail.com
In reply to: Kyotaro HORIGUCHI (#31)
1 attachment(s)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On Fri, Nov 30, 2018 at 9:26, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:

Hello.

At Fri, 30 Nov 2018 07:48:26 +0100, Pavel Stehule <pavel.stehule@gmail.com> wrote in <CAFj8pRD7Zg07t4NpZu09T4RgXz0bTvyYg2eMVoH+o_drNoiz6w@mail.gmail.com>

Hi

On Thu, Nov 29, 2018 at 14:44, Dmitry Dolgov <9erthalion6@gmail.com> wrote:

On Fri, Sep 21, 2018 at 1:30 PM Pavel Stehule <pavel.stehule@gmail.com> wrote:

Thank you for comments

Attached updated patch

Unfortunately, current version of the patch doesn't pass make check, something
is missing for xml tests. Could you please rebase it?

After that I hope someone from reviewers (Kyotaro?) can probably confirm if
it's still in a good shape. For now I'm moving it to the next CF.

Sure. Sorry for coming late. I reconfirmed this.

The most significant change in this version is namespace name
generation.

- We can remove strlen from mutate_name by mutating the string in
reverse order. We don't need to mutate it in a human-friendly
order. The name would be one letter long in almost all cases.

Concretely, the order I have in mind is as follows:

"" "a" "b" ..."z" "aa" "ba" "ca"... "za" "ab"..

done
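
The reverse-order mutation suggested above can be sketched as follows.
This is only an illustration, not the code from the patch: the signature
given to mutate_name here is an assumption, and pick_unused_name is a
hypothetical helper showing how a generated name could be chosen so that
it differs from every prefix the user listed. The sketch assumes names
consist only of the letters 'a' to 'z' and that the caller supplies the
buffer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Advance name to its successor in the order "", "a", ..., "z", "aa", "ba", ...
 * The leftmost character is the least significant one, so the carry walks
 * forward through the string and its length is never needed up front.
 * Returns false (leaving name wrapped to all 'a's) if the longer result
 * would not fit into bufsize bytes.
 */
static bool
mutate_name(char *name, size_t bufsize)
{
    size_t      i;

    for (i = 0; name[i] != '\0'; i++)
    {
        if (name[i] < 'z')
        {
            name[i]++;
            return true;
        }
        name[i] = 'a';          /* wrap this position and carry onward */
    }
    if (i + 2 > bufsize)
        return false;
    name[i] = 'a';              /* every position wrapped: grow by one */
    name[i + 1] = '\0';
    return true;
}

/* Find the first generated name that is not among the used prefixes. */
static bool
pick_unused_name(const char *const *used, size_t nused,
                 char *buf, size_t bufsize)
{
    assert(bufsize > 0);
    buf[0] = '\0';
    while (mutate_name(buf, bufsize))
    {
        size_t      i;

        for (i = 0; i < nused; i++)
            if (strcmp(used[i], buf) == 0)
                break;
        if (i == nused)
            return true;        /* no clash with a user-supplied prefix */
    }
    return false;
}
```

Because the first candidate is "a", the generated name is a single
letter unless the user has already taken all 26 one-letter prefixes.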

- Should 'propriety' be 'properties'?

+ /* register namespace if all propriety are available */

fixed

- Is the "if" a typo for "in"?

+     * collect ns names if ResTarget format for possible usage
+     * in getUniqNames function.

yup, fixed

- I suppose the following should be like "register default
namespace definition if any".

+ /* get default namespace name when it is required */

fixed

The following may not be new. (Note that I'm not a native speaker.)

- I cannot read this. (I might be to blame..)

+ * default namespace for XPath expressions. Because there are not any API
+ * how to transform or access to parsed XPath expression we have to parse
+ * XPath here.

- This might need to explain "by what".

+ * Those functionalities are implemented with a simple XPath parser/
+ * preprocessor. This XPath parser transforms a XPath expression to another
+ * XPath expression that can be used by libxml2 XPath evaluation. It doesn't
+ * replace libxml2 XPath parser or libxml2 XPath expression evaluation.

- "add" -> "adds", "def_namespace_name" seems to need to be
replaced with something else.

+ * This transformation add def_namespace_name to any unqualified node name
+ * or attribute name of xpath expression.

I tried to phrase it better, but I am sorry, my English is not good.

(Sorry, I'll look further later.)

I am sending an updated patch.

Regards

Pavel

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachments:

default-namespaces-20181130-2.patch.gz (application/gzip)
#33Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#27)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

Hi,

On 2018-09-18 11:33:38 -0400, Tom Lane wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

[ xml-xpath-default-ns-7.patch ]

At Andrew's prompting, I took a look over this patch. I don't know much
of anything about XML, so I have no idea as to standards compliance here,
but I do have some comments:

* I'm fairly uncomfortable with the idea that we're going to maintain
our own XPath parser. That seems like a recipe for lots of future
work ... and the code is far too underdocumented for anyone to actually
maintain it. (Case in point: _transformXPath's arguments are not
documented at all, and in fact there's hardly a word about what it's
even supposed to *do*.)

We were looking at this patch at the pgday developer meeting: Our
impression is that this patch should be rejected. If really desired, the
best approach seems to be to actually implement this in libxml, and then go
from there.

Greetings,

Andres Freund

#34Pavel Stehule
pavel.stehule@gmail.com
In reply to: Andres Freund (#33)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On Thu, Jan 31, 2019 at 14:44, Andres Freund <andres@anarazel.de> wrote:

Hi,

On 2018-09-18 11:33:38 -0400, Tom Lane wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

[ xml-xpath-default-ns-7.patch ]

At Andrew's prompting, I took a look over this patch. I don't know much
of anything about XML, so I have no idea as to standards compliance here,
but I do have some comments:

* I'm fairly uncomfortable with the idea that we're going to maintain
our own XPath parser. That seems like a recipe for lots of future
work ... and the code is far too underdocumented for anyone to actually
maintain it. (Case in point: _transformXPath's arguments are not
documented at all, and in fact there's hardly a word about what it's
even supposed to *do*.)

We were looking at this patch at the pgday developer meeting: Our
impression is that this patch should be rejected. If really desired, the
best approach seems to be to actually implement this in libxml, and then go
from there.

Unfortunately, the development of libxml2 is frozen.

I have to accept it.

Regards

Pavel

Greetings,

Andres Freund

#35Michael Paquier
michael@paquier.xyz
In reply to: Pavel Stehule (#34)
Re: [HACKERS] proposal - Default namespaces for XPath expressions (PostgreSQL 11)

On Thu, Jan 31, 2019 at 02:54:49PM +0100, Pavel Stehule wrote:

Unfortunately, the development of libxml2 is frozen.

I have to accept it.

And marked as rejected, based on the last consensus.
--
Michael