xpath_array with namespaces support

Started by Nikolay Samokhvalov almost 19 years ago, 24 messages
#1Nikolay Samokhvalov
samokhvalov@gmail.com
1 attachment(s)

As a result of the discussion with Peter, here is a modified patch for
xpath_array() with namespace support.

The signature is:
_xml xpath_array(text xpathQuery, xml xmlValue[, _text namespacesBindings])

The third argument is a 2-dimensional array defining the namespace
bindings. Simple examples:

xmltest=# SELECT xpath_array('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
xpath_array
----------------
{"number one"}
(1 row)

xmltest=# SELECT xpath_array('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
xpath_array
-------------
{1,2}
(1 row)
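
The layout of the third argument can be sketched in plain C. This is a minimal illustration, assuming the 2 x n array is flattened row by row (n prefixes first, then n URIs), which is how the attached patch reads it; `NsBinding` and `split_ns_bindings` are hypothetical names used only here and are not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper type (not in the patch): one prefix/URI binding. */
typedef struct NsBinding
{
	const char *prefix;
	const char *uri;
} NsBinding;

/*
 * Split a flattened 2 x n array (n prefixes followed by n URIs) into
 * prefix/URI pairs.  Returns the number of bindings written, or -1 if
 * the item count is odd, an element is NULL, or "out" is too small.
 */
int
split_ns_bindings(const char *const *flat, size_t nitems,
				  NsBinding *out, size_t out_cap)
{
	size_t		n,
				i;

	assert(flat != NULL && out != NULL);

	if (nitems % 2 != 0)
		return -1;
	n = nitems / 2;
	if (n > out_cap)
		return -1;

	for (i = 0; i < n; i++)
	{
		/* neither a prefix nor a URI may be NULL */
		if (flat[i] == NULL || flat[n + i] == NULL)
			return -1;
		out[i].prefix = flat[i];	/* first row: prefixes */
		out[i].uri = flat[n + i];	/* second row: URIs */
	}
	return (int) n;
}
```

With the flattened items {'myns', 'myns2', 'http://example.com', 'http://example2.com'}, the sketch pairs 'myns2' with 'http://example2.com'.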

Thoughts regarding other XPath functions were posted a couple of days
ago: http://archives.postgresql.org/pgsql-patches/2007-02/msg00373.php

If there are no objections, we could name the function provided in this
patch xpath() or xmlpath() (the latter is similar to the SQL/XML
functions).

Also, maybe someone can suggest a better approach for passing namespace
bindings (more convenient than ARRAY[ARRAY[...], ARRAY[...]])?
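
One possible alternative, offered only as an illustration and not something this patch implements, is to pass one {prefix, URI} pair per row, for example ARRAY[ARRAY['loc', 'http://127.0.0.1']]. A minimal C sketch of looking up a URI in that layout, assuming the n x 2 array is flattened row by row; `lookup_ns_uri` is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up the URI bound to "prefix" in a flattened n x 2 array where
 * item 2*i is a prefix and item 2*i + 1 is its URI.  Returns NULL if
 * the prefix is not bound.
 */
const char *
lookup_ns_uri(const char *const *pairs, size_t npairs, const char *prefix)
{
	size_t		i;

	assert(pairs != NULL && prefix != NULL);

	for (i = 0; i < npairs; i++)
	{
		if (strcmp(pairs[2 * i], prefix) == 0)
			return pairs[2 * i + 1];
	}
	return NULL;
}
```

Each binding then reads as a unit, at the cost of a different array shape than the one used in the examples above.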

--
Best regards,
Nikolay

Attachments:

xpath.w.namespaces.20070220.patch (text/x-patch; charset=ANSI_X3.4-1968)
Index: src/backend/utils/adt/xml.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.31
diff -u -r1.31 xml.c
--- src/backend/utils/adt/xml.c	16 Feb 2007 18:37:43 -0000	1.31
+++ src/backend/utils/adt/xml.c	20 Feb 2007 23:20:54 -0000
@@ -47,6 +47,8 @@
 #include <libxml/uri.h>
 #include <libxml/xmlerror.h>
 #include <libxml/xmlwriter.h>
+#include <libxml/xpath.h>
+#include <libxml/xpathInternals.h>
 #endif /* USE_LIBXML */
 
 #include "catalog/namespace.h"
@@ -65,6 +67,7 @@
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "access/tupmacs.h"
 #include "utils/xml.h"
 
 
@@ -86,6 +89,7 @@
 static int		parse_xml_decl(const xmlChar *str, size_t *lenp, xmlChar **version, xmlChar **encoding, int *standalone);
 static bool		print_xml_decl(StringInfo buf, const xmlChar *version, pg_enc encoding, int standalone);
 static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace, xmlChar *encoding);
+static text		*xml_xmlnodetotext(xmlNodePtr cur);
 
 #endif /* USE_LIBXML */
 
@@ -1463,7 +1467,6 @@
 	return buf.data;
 }
 
-
 /*
  * Map SQL value to XML value; see SQL/XML:2003 section 9.16.
  */
@@ -2334,3 +2337,238 @@
 	else
 		appendStringInfoString(result, "</row>\n\n");
 }
+
+
+/*
+ * XPath related functions
+ */
+
+/* 
+ * Convert XML node to text (return only value, it's not dumping)
+ */
+text *
+xml_xmlnodetotext(xmlNodePtr cur)
+{
+	xmlChar    		*str;
+	text			*result;
+	size_t			len;	
+	
+	str = xmlXPathCastNodeToString(cur);
+	len = strlen((char *) str);
+	result = (text *) palloc(len + VARHDRSZ);
+	VARATT_SIZEP(result) = len + VARHDRSZ;
+	memcpy(VARDATA(result), str, len);
+	
+	return result;
+}
+
+/*
+ * Evaluate XPath expression and return array of XML values.
+ * As we have no support of XQuery sequences yet, this function seems
+ * to be the most useful one (array of XML functions plays a role of
+ * some kind of substitution for XQuery sequences).
+
+ * Workaround here: we parse XML data in different way to allow XPath for
+ * fragments (see "XPath for fragment" TODO comment inside).
+ */
+Datum
+xpath_array(PG_FUNCTION_ARGS)
+{
+#ifdef USE_LIBXML
+	ArrayBuildState		*astate = NULL;
+	xmlParserCtxtPtr	ctxt = NULL;
+	xmlDocPtr			doc = NULL;
+	xmlXPathContextPtr	xpathctx = NULL;
+	xmlXPathCompExprPtr	xpathcomp = NULL;
+	xmlXPathObjectPtr	xpathobj = NULL;
+	int32				len, xpath_len;
+	xmlChar				*string, *xpath_expr;
+	bool				res_is_null = FALSE;
+	int					i;
+	xmltype				*data  = PG_GETARG_XML_P(1);
+	text				*xpath_expr_text = PG_GETARG_TEXT_P(0);
+	ArrayType			*namespaces;
+	int					*dims, ndims, ns_count = 0, bitmask = 1;
+	char				*ptr;
+	bits8				*bitmap;
+	char				**ns_names = NULL, **ns_uris = NULL;
+	int16				typlen;
+	bool				typbyval;
+	char				typalign;
+	
+	/* Namespace mappings passed as text[].
+	 * Assume that 2-dimensional array has been passed, 
+	 * the 1st subarray is array of names, the 2nd -- array of URIs,
+	 * example: ARRAY[ARRAY['myns', 'myns2'], ARRAY['http://example.com', 'http://example2.com']]. 
+	 */
+	if (!PG_ARGISNULL(2))
+	{
+		namespaces = PG_GETARG_ARRAYTYPE_P(2);
+		ndims = ARR_NDIM(namespaces);
+		dims = ARR_DIMS(namespaces);
+		
+		/* Sanity check */
+		if (ndims != 2)
+			ereport(ERROR, (errmsg("invalid array passed for namespace mappings"),
+							errdetail("Only 2-dimensional array may be used for namespace mappings.")));
+		
+		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+		
+		ns_count = ArrayGetNItems(ndims, dims) / 2;
+		get_typlenbyvalalign(ARR_ELEMTYPE(namespaces),
+							 &typlen, &typbyval, &typalign);
+		ns_names = (char **) palloc(ns_count * sizeof(char *));
+		ns_uris = (char **) palloc(ns_count * sizeof(char *));
+		ptr = ARR_DATA_PTR(namespaces);
+		bitmap = ARR_NULLBITMAP(namespaces);
+		bitmask = 1;
+		
+		for (i = 0; i < ns_count * 2; i++)
+		{
+			if (bitmap && (*bitmap & bitmask) == 0)
+				ereport(ERROR, (errmsg("neither namespace nor URI may be NULL"))); /* TODO: better message */
+			else
+			{
+				if (i < ns_count)
+					ns_names[i] = DatumGetCString(DirectFunctionCall1(textout,
+														  PointerGetDatum(ptr)));
+				else
+					ns_uris[i - ns_count] = DatumGetCString(DirectFunctionCall1(textout,
+														  PointerGetDatum(ptr)));
+				ptr = att_addlength(ptr, typlen, PointerGetDatum(ptr));
+				ptr = (char *) att_align(ptr, typalign);
+			}
+	
+			/* advance bitmap pointer if any */
+			if (bitmap)
+			{
+				bitmask <<= 1;
+				if (bitmask == 0x100)
+				{
+					bitmap++;
+					bitmask = 1;
+				}
+			}
+		}
+	}
+	
+	len = VARSIZE(data) - VARHDRSZ;
+	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+	if (xpath_len == 0)
+		ereport(ERROR, (errmsg("empty XPath expression")));
+	
+	if (xmlStrncmp((xmlChar *) VARDATA(data), (xmlChar *) "<?xml", 5) == 0)
+	{
+		string = palloc(len + 1);
+		memcpy(string, VARDATA(data), len);
+		string[len] = '\0';
+		xpath_expr = palloc(xpath_len + 1);
+		memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+		xpath_expr[xpath_len] = '\0';
+	}
+	else
+	{
+		/* use "<x>...</x>" as dummy root element to enable XPath for fragments */
+		/* TODO: (XPath for fragment) find better solution to work with XML fragment! */
+		string = xmlStrncatNew((xmlChar *) "<x>", (xmlChar *) VARDATA(data), len);
+		string = xmlStrncat(string, (xmlChar *) "</x>", 5);
+		len += 7;
+		xpath_expr = xmlStrncatNew((xmlChar *) "/x", (xmlChar *) VARDATA(xpath_expr_text), xpath_len);
+		xpath_len += 2;
+	}
+	
+	xml_init();
+
+	PG_TRY();
+	{
+		/* redundant XML parsing (two parsings for the same value in the same session are possible) */
+		ctxt = xmlNewParserCtxt();
+		if (ctxt == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not allocate parser context");
+		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+		if (doc == NULL)
+			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+						"could not parse XML data");
+		xpathctx = xmlXPathNewContext(doc);
+		if (xpathctx == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not allocate XPath context");
+		xpathctx->node = xmlDocGetRootElement(doc);
+		if (xpathctx->node == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not find root XML element"); 
+
+		/* register namespaces, if any */
+		if ((ns_count > 0) && ns_names && ns_uris)
+			for (i = 0; i < ns_count; i++)
+				if (0 != xmlXPathRegisterNs(xpathctx, (xmlChar *) ns_names[i], (xmlChar *) ns_uris[i]))
+					ereport(ERROR, 
+						(errmsg("could not register XML namespace with prefix=\"%s\" and href=\"%s\"", ns_names[i], ns_uris[i])));
+		
+		xpathcomp = xmlXPathCompile(xpath_expr);
+		if (xpathcomp == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"invalid XPath expression"); /* TODO: show proper XPath error details */
+		
+		xpathobj = xmlXPathCompiledEval(xpathcomp, xpathctx);
+		xmlXPathFreeCompExpr(xpathcomp);
+		if (xpathobj == NULL)
+			ereport(ERROR, (errmsg("could not create XPath object")));
+		
+		if (xpathobj->nodesetval == NULL)
+			res_is_null = TRUE;
+		
+		if (!res_is_null && xpathobj->nodesetval->nodeNr == 0)
+			/* TODO maybe empty array should be here, not NULL? (if so -- fix segfault) */
+			/*PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));*/
+			res_is_null = TRUE;
+		
+		if (!res_is_null) 
+			for (i = 0; i < xpathobj->nodesetval->nodeNr; i++)
+			{
+				Datum		elem;
+				bool		elemisnull = false;
+				elem = PointerGetDatum(xml_xmlnodetotext(xpathobj->nodesetval->nodeTab[i]));
+				astate = accumArrayResult(astate, elem,
+										  elemisnull, XMLOID,
+										  CurrentMemoryContext);
+			}
+		
+		xmlXPathFreeObject(xpathobj);
+		xmlXPathFreeContext(xpathctx);
+		xmlFreeParserCtxt(ctxt);
+		xmlFreeDoc(doc);
+		xmlCleanupParser();
+	}
+	PG_CATCH();
+	{
+		if (xpathcomp)
+			xmlXPathFreeCompExpr(xpathcomp);
+		if (xpathobj)
+			xmlXPathFreeObject(xpathobj);
+		if (xpathctx)
+			xmlXPathFreeContext(xpathctx);
+		if (doc)
+			xmlFreeDoc(doc);
+		if (ctxt)
+			xmlFreeParserCtxt(ctxt);
+		xmlCleanupParser();
+
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+	
+	if (res_is_null)
+	{
+		PG_RETURN_NULL();
+	}
+	else
+	{
+		PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));
+	}
+#else
+	NO_XML_SUPPORT();
+	return 0;
+#endif
+}
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.446
diff -u -r1.446 pg_proc.h
--- src/include/catalog/pg_proc.h	20 Feb 2007 10:00:25 -0000	1.446
+++ src/include/catalog/pg_proc.h	20 Feb 2007 23:20:57 -0000
@@ -4071,6 +4071,10 @@
 DATA(insert OID = 2930 (  query_to_xml_and_xmlschema  PGNSP PGUID 12 100 0 f f t f s 4 142 "25 16 16 25" _null_ _null_ "{query,nulls,tableforest,targetns}" query_to_xml_and_xmlschema - _null_ ));
 DESCR("map query result and structure to XML and XML Schema");
 
+DATA(insert OID = 2931 (  xpath_array      PGNSP PGUID 12 1 0 f f t f i 2 143 "25 142" _null_ _null_ _null_ xpath_array - _null_ ));
+DESCR("evaluate XPath expression");
+DATA(insert OID = 2932 (  xpath_array      PGNSP PGUID 12 1 0 f f t f i 3 143 "25 142 1009" _null_ _null_ _null_ xpath_array - _null_ ));
+DESCR("evaluate XPath expression, with namespaces support");
 
 /* uuid */ 
 DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 f f t f i 1 2950 "2275" _null_ _null_ _null_ uuid_in - _null_ ));
Index: src/include/utils/xml.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/xml.h,v
retrieving revision 1.16
diff -u -r1.16 xml.h
--- src/include/utils/xml.h	16 Feb 2007 07:46:55 -0000	1.16
+++ src/include/utils/xml.h	20 Feb 2007 23:20:57 -0000
@@ -36,6 +36,7 @@
 extern Datum texttoxml(PG_FUNCTION_ARGS);
 extern Datum xmltotext(PG_FUNCTION_ARGS);
 extern Datum xmlvalidate(PG_FUNCTION_ARGS);
+extern Datum xpath_array(PG_FUNCTION_ARGS);
 
 extern Datum table_to_xml(PG_FUNCTION_ARGS);
 extern Datum query_to_xml(PG_FUNCTION_ARGS);
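
The fragment workaround in the patch above (a dummy `<x>...</x>` root plus a `/x` prefix on the XPath expression) can be sketched without libxml2. This is a minimal illustration using only the C standard library; `join_parts`, `wrap_fragment` and `prefix_xpath` are hypothetical names, and the inputs are treated as length-counted buffers that need not be NUL-terminated, as with `VARDATA()`. Note that in this sketch the document length grows by 7 bytes only, while the expression length is tracked separately and grows by 2.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Concatenate head + body[0..body_len) + tail into a new NUL-terminated buffer. */
static char *
join_parts(const char *head, const char *body, size_t body_len,
		   const char *tail, size_t *out_len)
{
	size_t		head_len = strlen(head);
	size_t		tail_len = strlen(tail);
	size_t		total = head_len + body_len + tail_len;
	char	   *buf = malloc(total + 1);

	if (buf == NULL)
		return NULL;
	memcpy(buf, head, head_len);
	memcpy(buf + head_len, body, body_len);
	memcpy(buf + head_len + body_len, tail, tail_len);
	buf[total] = '\0';
	*out_len = total;
	return buf;
}

/* Wrap an XML fragment in a dummy root element: length grows by 7. */
char *
wrap_fragment(const char *data, size_t len, size_t *out_len)
{
	assert(data != NULL && out_len != NULL);
	return join_parts("<x>", data, len, "</x>", out_len);
}

/* Prefix the XPath expression so it starts below the dummy root: length grows by 2. */
char *
prefix_xpath(const char *expr, size_t len, size_t *out_len)
{
	assert(expr != NULL && out_len != NULL);
	return join_parts("/x", expr, len, "", out_len);
}
```

The caller owns both returned buffers and releases them with `free()`.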
#2Bruce Momjian
bruce@momjian.us
In reply to: Nikolay Samokhvalov (#1)
Re: xpath_array with namespaces support

Your patch has been added to the PostgreSQL unapplied patches list at:

http://momjian.postgresql.org/cgi-bin/pgpatches

It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.


---------------------------(end of broadcast)---------------------------
TIP 4: Have you searched our list archives?

http://archives.postgresql.org

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#3Bruce Momjian
bruce@momjian.us
In reply to: Nikolay Samokhvalov (#1)
Re: xpath_array with namespaces support

I tried this patch but found this regression failure:

-- Considering only built-in procs (prolang = 12), look for multiple uses
-- of the same internal function (ie, matching prosrc fields). It's OK to
-- have several entries with different pronames for the same internal function,
-- but conflicts in the number of arguments and other critical items should
-- be complained of. (We don't check data types here; see next query.)
-- Note: ignore aggregate functions here, since they all point to the same
-- dummy built-in function.
SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
(p1.proisagg = false OR p2.proisagg = false) AND
(p1.prolang != p2.prolang OR
p1.proisagg != p2.proisagg OR
p1.prosecdef != p2.prosecdef OR
p1.proisstrict != p2.proisstrict OR
p1.proretset != p2.proretset OR
p1.provolatile != p2.provolatile OR
p1.pronargs != p2.pronargs);
oid | proname | oid | proname
------+-------------+------+-------------
2931 | xpath_array | 2932 | xpath_array
(1 row)

This is because you have two pg_proc entries that call the same C
function, xpath_array, with 2 and 3 arguments. Seems we don't do this
anywhere else.

I also had to add a #ifdef USE_LIBXML around xml_xmlnodetotext(). Please
research a fix to this and resubmit. Thanks.
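
The condition in the regression query can be restated as a small C predicate. This is a sketch under assumptions, not PostgreSQL code: `ProcEntry` is a hypothetical struct carrying only the pg_proc columns the query touches, and `procs_conflict` mirrors the WHERE clause term by term.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define INTERNAL_LANG 12		/* prolang = 12, as in the query's comment */

/* Hypothetical, trimmed-down view of a pg_proc row. */
typedef struct ProcEntry
{
	unsigned	oid;
	const char *proname;
	const char *prosrc;
	int			prolang;
	bool		proisagg;
	bool		prosecdef;
	bool		proisstrict;
	bool		proretset;
	char		provolatile;
	int			pronargs;
} ProcEntry;

/* True if the pair (p1, p2) would be reported by the regression query. */
bool
procs_conflict(const ProcEntry *p1, const ProcEntry *p2)
{
	assert(p1 != NULL && p2 != NULL);

	if (!(p1->oid < p2->oid))
		return false;
	if (strcmp(p1->prosrc, p2->prosrc) != 0)
		return false;			/* different internal functions */
	if (p1->prolang != INTERNAL_LANG || p2->prolang != INTERNAL_LANG)
		return false;
	if (p1->proisagg && p2->proisagg)
		return false;			/* aggregates share a dummy function */

	return p1->prolang != p2->prolang ||
		p1->proisagg != p2->proisagg ||
		p1->prosecdef != p2->prosecdef ||
		p1->proisstrict != p2->proisstrict ||
		p1->proretset != p2->proretset ||
		p1->provolatile != p2->provolatile ||
		p1->pronargs != p2->pronargs;
}
```

Two entries that share prosrc "xpath_array" but declare 2 and 3 arguments satisfy the last term, which is why OIDs 2931 and 2932 show up in the output.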


--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#4Nikolay Samokhvalov
nikolay@samokhvalov.com
In reply to: Bruce Momjian (#3)
Re: xpath_array with namespaces support

On 3/3/07, Bruce Momjian <bruce@momjian.us> wrote:

I tried this patch but found this regression failure:

-- Considering only built-in procs (prolang = 12), look for multiple uses
-- of the same internal function (ie, matching prosrc fields). It's OK to
-- have several entries with different pronames for the same internal function,
-- but conflicts in the number of arguments and other critical items should
-- be complained of. (We don't check data types here; see next query.)
-- Note: ignore aggregate functions here, since they all point to the same
-- dummy built-in function.
SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
(p1.proisagg = false OR p2.proisagg = false) AND
(p1.prolang != p2.prolang OR
p1.proisagg != p2.proisagg OR
p1.prosecdef != p2.prosecdef OR
p1.proisstrict != p2.proisstrict OR
p1.proretset != p2.proretset OR
p1.provolatile != p2.provolatile OR
p1.pronargs != p2.pronargs);
oid | proname | oid | proname
------+-------------+------+-------------
2931 | xpath_array | 2932 | xpath_array
(1 row)

This is because you have two pg_proc entries that call the same C
function, xpath_array, with 2 and 3 arguments. Seems we don't do this
anywhere else.

I also had to add a #ifdef USE_LIBXML around xml_xmlnodetotext(). Please
research a fix to this and resubmit. Thanks.

OK.
I'll fix these issues and extend the patch with regression tests and
docs for xpath_array(). I'll resubmit it very soon.

--
Best regards,
Nikolay

#5Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Nikolay Samokhvalov (#4)
1 attachment(s)
Re: xpath_array with namespaces support

On 3/4/07, Nikolay Samokhvalov <nikolay@samokhvalov.com> wrote:

I'll fix these issues and extend the patch with regression tests and
docs for xpath_array(). I'll resubmit it very soon.

Here is a new version of the patch. I haven't changed any part of the
docs yet. Since there were no objections, I've changed the name of the
function to xmlpath().
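
In the attached patch the three-argument function is declared non-strict and checks its first two arguments itself, while the two-argument form is an SQL wrapper that passes NULL as the third argument. A minimal model of that control flow in plain C, assuming a NULL pointer stands in for SQL NULL; the enum and function names are hypothetical and no XPath evaluation is performed:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcomes of the argument checks done before evaluation. */
typedef enum XmlpathOutcome
{
	XMLPATH_RETURNS_NULL,		/* xpath or xml argument is NULL */
	XMLPATH_EMPTY_XPATH_ERROR,	/* "empty XPath expression" */
	XMLPATH_EVALUATES			/* checks passed; evaluation would follow */
} XmlpathOutcome;

/* Model of the three-argument form: it must handle NULL inputs itself. */
XmlpathOutcome
xmlpath3_model(const char *xpath, const char *xml,
			   const char *const *namespaces, size_t ns_count)
{
	/* a NULL namespace array simply means "no bindings" */
	assert(namespaces != NULL || ns_count == 0);

	if (xpath == NULL || xml == NULL)
		return XMLPATH_RETURNS_NULL;
	if (xpath[0] == '\0')
		return XMLPATH_EMPTY_XPATH_ERROR;
	return XMLPATH_EVALUATES;
}

/* Model of the two-argument wrapper: delegate with no namespace bindings. */
XmlpathOutcome
xmlpath2_model(const char *xpath, const char *xml)
{
	return xmlpath3_model(xpath, xml, NULL, 0);
}
```

Because only one pg_proc entry now points at the C function, the two declared argument counts no longer share a prosrc.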

--
Best regards,
Nikolay

Attachments:

xpath.w.namespaces.20070304.patch (text/x-patch; charset=ANSI_X3.4-1968)
Index: src/backend/utils/adt/xml.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.34
diff -u -r1.34 xml.c
--- src/backend/utils/adt/xml.c	3 Mar 2007 19:32:55 -0000	1.34
+++ src/backend/utils/adt/xml.c	5 Mar 2007 01:14:57 -0000
@@ -47,6 +47,8 @@
 #include <libxml/uri.h>
 #include <libxml/xmlerror.h>
 #include <libxml/xmlwriter.h>
+#include <libxml/xpath.h>
+#include <libxml/xpathInternals.h>
 #endif /* USE_LIBXML */
 
 #include "catalog/namespace.h"
@@ -67,6 +69,7 @@
 #include "utils/datetime.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "access/tupmacs.h"
 #include "utils/xml.h"
 
 
@@ -88,6 +91,7 @@
 static int		parse_xml_decl(const xmlChar *str, size_t *lenp, xmlChar **version, xmlChar **encoding, int *standalone);
 static bool		print_xml_decl(StringInfo buf, const xmlChar *version, pg_enc encoding, int standalone);
 static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace, xmlChar *encoding);
+static text		*xml_xmlnodetotext(xmlNodePtr cur);
 
 #endif /* USE_LIBXML */
 
@@ -1463,7 +1467,6 @@
 	return buf.data;
 }
 
-
 /*
  * Map SQL value to XML value; see SQL/XML:2003 section 9.16.
  */
@@ -2403,3 +2406,247 @@
 	else
 		appendStringInfoString(result, "</row>\n\n");
 }
+
+
+/*
+ * XPath related functions
+ */
+
+#ifdef USE_LIBXML
+/* 
+ * Convert XML node to text (return value only, it's not dumping)
+ */
+text *
+xml_xmlnodetotext(xmlNodePtr cur)
+{
+	xmlChar    		*str;
+	text			*result;
+	size_t			len;	
+	
+	str = xmlXPathCastNodeToString(cur);
+	len = strlen((char *) str);
+	result = (text *) palloc(len + VARHDRSZ);
+	SET_VARSIZE(result, len + VARHDRSZ);
+	memcpy(VARDATA(result), str, len);
+	
+	return result;
+}
+#endif
+
+/*
+ * Evaluate XPath expression and return array of XML values.
+ * As we have no support of XQuery sequences yet, this function seems
+ * to be the most useful one (array of XML functions plays a role of
+ * some kind of substitution for XQuery sequences).
+
+ * Workaround here: we parse XML data in different way to allow XPath for
+ * fragments (see "XPath for fragment" TODO comment inside).
+ */
+Datum
+xmlpath(PG_FUNCTION_ARGS)
+{
+#ifdef USE_LIBXML
+	ArrayBuildState		*astate = NULL;
+	xmlParserCtxtPtr	ctxt = NULL;
+	xmlDocPtr			doc = NULL;
+	xmlXPathContextPtr	xpathctx = NULL;
+	xmlXPathCompExprPtr	xpathcomp = NULL;
+	xmlXPathObjectPtr	xpathobj = NULL;
+	int32				len, xpath_len;
+	xmlChar				*string, *xpath_expr;
+	bool				res_is_null = FALSE;
+	int					i;
+	xmltype				*data;
+	text				*xpath_expr_text;
+	ArrayType			*namespaces;
+	int					*dims, ndims, ns_count = 0, bitmask = 1;
+	char				*ptr;
+	bits8				*bitmap;
+	char				**ns_names = NULL, **ns_uris = NULL;
+	int16				typlen;
+	bool				typbyval;
+	char				typalign;
+	
+	/* the function is not strict, we must check first two args */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
+		PG_RETURN_NULL();
+	
+	xpath_expr_text = PG_GETARG_TEXT_P(0);
+	data  = PG_GETARG_XML_P(1);
+	
+	/* Namespace mappings passed as text[].
+	 * Assume that 2-dimensional array has been passed, 
+	 * the 1st subarray is array of names, the 2nd -- array of URIs,
+	 * example: ARRAY[ARRAY['myns', 'myns2'], ARRAY['http://example.com', 'http://example2.com']]. 
+	 */
+	if (!PG_ARGISNULL(2))
+	{
+		namespaces = PG_GETARG_ARRAYTYPE_P(2);
+		ndims = ARR_NDIM(namespaces);
+		dims = ARR_DIMS(namespaces);
+		
+		/* Sanity check */
+		if (ndims != 2)
+			ereport(ERROR, (errmsg("invalid array passed for namespace mappings"),
+							errdetail("Only 2-dimensional array may be used for namespace mappings.")));
+		
+		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+		
+		ns_count = ArrayGetNItems(ndims, dims) / 2;
+		get_typlenbyvalalign(ARR_ELEMTYPE(namespaces),
+							 &typlen, &typbyval, &typalign);
+		ns_names = (char **) palloc(ns_count * sizeof(char *));
+		ns_uris = (char **) palloc(ns_count * sizeof(char *));
+		ptr = ARR_DATA_PTR(namespaces);
+		bitmap = ARR_NULLBITMAP(namespaces);
+		bitmask = 1;
+		
+		for (i = 0; i < ns_count * 2; i++)
+		{
+			if (bitmap && (*bitmap & bitmask) == 0)
+				ereport(ERROR, (errmsg("neither namespace nor URI may be NULL"))); /* TODO: better message */
+			else
+			{
+				if (i < ns_count)
+					ns_names[i] = DatumGetCString(DirectFunctionCall1(textout,
+														  PointerGetDatum(ptr)));
+				else
+					ns_uris[i - ns_count] = DatumGetCString(DirectFunctionCall1(textout,
+														  PointerGetDatum(ptr)));
+				ptr = att_addlength(ptr, typlen, PointerGetDatum(ptr));
+				ptr = (char *) att_align(ptr, typalign);
+			}
+	
+			/* advance bitmap pointer if any */
+			if (bitmap)
+			{
+				bitmask <<= 1;
+				if (bitmask == 0x100)
+				{
+					bitmap++;
+					bitmask = 1;
+				}
+			}
+		}
+	}
+	
+	len = VARSIZE(data) - VARHDRSZ;
+	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+	if (xpath_len == 0)
+		ereport(ERROR, (errmsg("empty XPath expression")));
+	
+	if (xmlStrncmp((xmlChar *) VARDATA(data), (xmlChar *) "<?xml", 5) == 0)
+	{
+		string = palloc(len + 1);
+		memcpy(string, VARDATA(data), len);
+		string[len] = '\0';
+		xpath_expr = palloc(xpath_len + 1);
+		memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+		xpath_expr[xpath_len] = '\0';
+	}
+	else
+	{
+		/* use "<x>...</x>" as dummy root element to enable XPath for fragments */
+		/* TODO: (XPath for fragment) find better solution to work with XML fragment! */
+		string = xmlStrncatNew((xmlChar *) "<x>", (xmlChar *) VARDATA(data), len);
+		string = xmlStrncat(string, (xmlChar *) "</x>", 5);
+		len += 7;
+		xpath_expr = xmlStrncatNew((xmlChar *) "/x", (xmlChar *) VARDATA(xpath_expr_text), xpath_len);
+		xpath_len += 2;
+	}
+	
+	xml_init();
+
+	PG_TRY();
+	{
+		/* redundant XML parsing (two parsings for the same value in the same session are possible) */
+		ctxt = xmlNewParserCtxt();
+		if (ctxt == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not allocate parser context");
+		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+		if (doc == NULL)
+			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+						"could not parse XML data");
+		xpathctx = xmlXPathNewContext(doc);
+		if (xpathctx == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not allocate XPath context");
+		xpathctx->node = xmlDocGetRootElement(doc);
+		if (xpathctx->node == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not find root XML element"); 
+
+		/* register namespaces, if any */
+		if ((ns_count > 0) && ns_names && ns_uris)
+			for (i = 0; i < ns_count; i++)
+				if (0 != xmlXPathRegisterNs(xpathctx, (xmlChar *) ns_names[i], (xmlChar *) ns_uris[i]))
+					ereport(ERROR, 
+						(errmsg("could not register XML namespace with prefix=\"%s\" and href=\"%s\"", ns_names[i], ns_uris[i])));
+		
+		xpathcomp = xmlXPathCompile(xpath_expr);
+		if (xpathcomp == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"invalid XPath expression"); /* TODO: show proper XPath error details */
+		
+		xpathobj = xmlXPathCompiledEval(xpathcomp, xpathctx);
+		xmlXPathFreeCompExpr(xpathcomp);
+		if (xpathobj == NULL)
+			ereport(ERROR, (errmsg("could not create XPath object")));
+		
+		if (xpathobj->nodesetval == NULL)
+			res_is_null = TRUE;
+		
+		if (!res_is_null && xpathobj->nodesetval->nodeNr == 0)
+			/* TODO maybe empty array should be here, not NULL? (if so -- fix segfault) */
+			/*PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));*/
+			res_is_null = TRUE;
+		
+		if (!res_is_null) 
+			for (i = 0; i < xpathobj->nodesetval->nodeNr; i++)
+			{
+				Datum		elem;
+				bool		elemisnull = false;
+				elem = PointerGetDatum(xml_xmlnodetotext(xpathobj->nodesetval->nodeTab[i]));
+				astate = accumArrayResult(astate, elem,
+										  elemisnull, XMLOID,
+										  CurrentMemoryContext);
+			}
+		
+		xmlXPathFreeObject(xpathobj);
+		xmlXPathFreeContext(xpathctx);
+		xmlFreeParserCtxt(ctxt);
+		xmlFreeDoc(doc);
+		xmlCleanupParser();
+	}
+	PG_CATCH();
+	{
+		if (xpathcomp)
+			xmlXPathFreeCompExpr(xpathcomp);
+		if (xpathobj)
+			xmlXPathFreeObject(xpathobj);
+		if (xpathctx)
+			xmlXPathFreeContext(xpathctx);
+		if (doc)
+			xmlFreeDoc(doc);
+		if (ctxt)
+			xmlFreeParserCtxt(ctxt);
+		xmlCleanupParser();
+
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+	
+	if (res_is_null)
+	{
+		PG_RETURN_NULL();
+	}
+	else
+	{
+		PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));
+	}
+#else
+	NO_XML_SUPPORT();
+	return 0;
+#endif
+}
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.447
diff -u -r1.447 pg_proc.h
--- src/include/catalog/pg_proc.h	3 Mar 2007 19:52:46 -0000	1.447
+++ src/include/catalog/pg_proc.h	5 Mar 2007 01:14:57 -0000
@@ -4073,6 +4073,10 @@
 DATA(insert OID = 2930 (  query_to_xml_and_xmlschema  PGNSP PGUID 12 100 0 f f t f s 4 142 "25 16 16 25" _null_ _null_ "{query,nulls,tableforest,targetns}" query_to_xml_and_xmlschema - _null_ ));
 DESCR("map query result and structure to XML and XML Schema");
 
+DATA(insert OID = 2931 (  xmlpath      PGNSP PGUID 12 1 0 f f f f i 3 143 "25 142 1009" _null_ _null_ _null_ xmlpath - _null_ ));
+DESCR("evaluate XPath expression, with namespaces support");
+DATA(insert OID = 2932 (  xmlpath      PGNSP PGUID 14 1 0 f f f f i 2 143 "25 142" _null_ _null_ _null_ "select pg_catalog.xmlpath($1, $2, NULL)" - _null_ ));
+DESCR("evaluate XPath expression");
 
 /* uuid */ 
 DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 f f t f i 1 2950 "2275" _null_ _null_ _null_ uuid_in - _null_ ));
Index: src/test/regress/expected/xml_1.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/xml_1.out,v
retrieving revision 1.13
diff -u -r1.13 xml_1.out
--- src/test/regress/expected/xml_1.out	15 Feb 2007 05:05:03 -0000	1.13
+++ src/test/regress/expected/xml_1.out	5 Mar 2007 01:14:58 -0000
@@ -197,3 +197,18 @@
  xmlview5   | SELECT XMLPARSE(CONTENT '<abc>x</abc>'::text STRIP WHITESPACE) AS "xmlparse";
 (2 rows)
 
+-- Text XPath expressions evaluation
+SELECT xmlpath('/value', data) FROM xmltest;
+ xmlpath 
+---------
+(0 rows)
+
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+ERROR:  no XML support in this installation
+CONTEXT:  SQL function "xmlpath" statement 1
+SELECT xmlpath('', '<!-- error -->');
+ERROR:  no XML support in this installation
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+ERROR:  no XML support in this installation
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
+ERROR:  no XML support in this installation
Index: src/test/regress/expected/xml.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/xml.out,v
retrieving revision 1.15
diff -u -r1.15 xml.out
--- src/test/regress/expected/xml.out	15 Feb 2007 05:05:03 -0000	1.15
+++ src/test/regress/expected/xml.out	5 Mar 2007 01:14:58 -0000
@@ -401,3 +401,33 @@
  xmlview9   | SELECT XMLSERIALIZE(CONTENT 'good'::"xml" AS text) AS "xmlserialize";
 (9 rows)
 
+-- Text XPath expressions evaluation
+SELECT xmlpath('/value', data) FROM xmltest;
+ xmlpath 
+---------
+ {one}
+ {two}
+(2 rows)
+
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+ ?column? 
+----------
+ t
+ t
+(2 rows)
+
+SELECT xmlpath('', '<!-- error -->');
+ERROR:  empty XPath expression
+CONTEXT:  SQL function "xmlpath" statement 1
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+    xmlpath     
+----------------
+ {"number one"}
+(1 row)
+
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
+ xmlpath 
+---------
+ {1,2}
+(1 row)
+
Index: src/include/utils/xml.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/xml.h,v
retrieving revision 1.16
diff -u -r1.16 xml.h
--- src/include/utils/xml.h	16 Feb 2007 07:46:55 -0000	1.16
+++ src/include/utils/xml.h	5 Mar 2007 01:14:57 -0000
@@ -36,6 +36,7 @@
 extern Datum texttoxml(PG_FUNCTION_ARGS);
 extern Datum xmltotext(PG_FUNCTION_ARGS);
 extern Datum xmlvalidate(PG_FUNCTION_ARGS);
+extern Datum xmlpath(PG_FUNCTION_ARGS);
 
 extern Datum table_to_xml(PG_FUNCTION_ARGS);
 extern Datum query_to_xml(PG_FUNCTION_ARGS);
Index: src/test/regress/sql/xml.sql
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/sql/xml.sql,v
retrieving revision 1.12
diff -u -r1.12 xml.sql
--- src/test/regress/sql/xml.sql	15 Feb 2007 05:05:03 -0000	1.12
+++ src/test/regress/sql/xml.sql	5 Mar 2007 01:14:58 -0000
@@ -144,3 +144,11 @@
 
 SELECT table_name, view_definition FROM information_schema.views
   WHERE table_name LIKE 'xmlview%' ORDER BY 1;
+
+-- Text XPath expressions evaluation
+
+SELECT xmlpath('/value', data) FROM xmltest;
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+SELECT xmlpath('', '<!-- error -->');
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
#6Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Nikolay Samokhvalov (#5)
1 attachment(s)
Re: [PATCHES] xpath_array with namespaces support

What about this patch? Without this small patch, XML functionality in 8.3 will be weak...
Will it be accepted?

On 3/5/07, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:

On 3/4/07, Nikolay Samokhvalov <nikolay@samokhvalov.com> wrote:

I'll fix these issues and extend the patch with regression tests and
docs for xpath_array(). I'll resubmit it very soon.

Here is a new version of the patch. I didn't change any part of docs yet.
Since there were no objections I've changed the name of the function
to xmlpath().

--
Best regards,
Nikolay

Attachments:

xpath.w.namespaces.20070304.patch (text/x-patch)
Index: src/backend/utils/adt/xml.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.34
diff -u -r1.34 xml.c
--- src/backend/utils/adt/xml.c	3 Mar 2007 19:32:55 -0000	1.34
+++ src/backend/utils/adt/xml.c	5 Mar 2007 01:14:57 -0000
@@ -47,6 +47,8 @@
 #include <libxml/uri.h>
 #include <libxml/xmlerror.h>
 #include <libxml/xmlwriter.h>
+#include <libxml/xpath.h>
+#include <libxml/xpathInternals.h>
 #endif /* USE_LIBXML */
 
 #include "catalog/namespace.h"
@@ -67,6 +69,7 @@
 #include "utils/datetime.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "access/tupmacs.h"
 #include "utils/xml.h"
 
 
@@ -88,6 +91,7 @@
 static int		parse_xml_decl(const xmlChar *str, size_t *lenp, xmlChar **version, xmlChar **encoding, int *standalone);
 static bool		print_xml_decl(StringInfo buf, const xmlChar *version, pg_enc encoding, int standalone);
 static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace, xmlChar *encoding);
+static text		*xml_xmlnodetotext(xmlNodePtr cur);
 
 #endif /* USE_LIBXML */
 
@@ -1463,7 +1467,6 @@
 	return buf.data;
 }
 
-
 /*
  * Map SQL value to XML value; see SQL/XML:2003 section 9.16.
  */
@@ -2403,3 +2406,247 @@
 	else
 		appendStringInfoString(result, "</row>\n\n");
 }
+
+
+/*
+ * XPath related functions
+ */
+
+#ifdef USE_LIBXML
+/* 
+ * Convert XML node to text (return value only, it's not dumping)
+ */
+text *
+xml_xmlnodetotext(xmlNodePtr cur)
+{
+	xmlChar    		*str;
+	text			*result;
+	size_t			len;	
+	
+	str = xmlXPathCastNodeToString(cur);
+	len = strlen((char *) str);
+	result = (text *) palloc(len + VARHDRSZ);
+	SET_VARSIZE(result, len + VARHDRSZ);
+	memcpy(VARDATA(result), str, len);
+	
+	return result;
+}
+#endif
+
+/*
+ * Evaluate XPath expression and return array of XML values.
+ * As we have no support of XQuery sequences yet, this function seems
+ * to be the most useful one (array of XML functions plays a role of
+ * some kind of substitution for XQuery sequences).
+
+ * Workaround here: we parse XML data in different way to allow XPath for
+ * fragments (see "XPath for fragment" TODO comment inside).
+ */
+Datum
+xmlpath(PG_FUNCTION_ARGS)
+{
+#ifdef USE_LIBXML
+	ArrayBuildState		*astate = NULL;
+	xmlParserCtxtPtr	ctxt = NULL;
+	xmlDocPtr			doc = NULL;
+	xmlXPathContextPtr	xpathctx = NULL;
+	xmlXPathCompExprPtr	xpathcomp = NULL;
+	xmlXPathObjectPtr	xpathobj = NULL;
+	int32				len, xpath_len;
+	xmlChar				*string, *xpath_expr;
+	bool				res_is_null = FALSE;
+	int					i;
+	xmltype				*data;
+	text				*xpath_expr_text;
+	ArrayType			*namespaces;
+	int					*dims, ndims, ns_count = 0, bitmask = 1;
+	char				*ptr;
+	bits8				*bitmap;
+	char				**ns_names = NULL, **ns_uris = NULL;
+	int16				typlen;
+	bool				typbyval;
+	char				typalign;
+	
+	/* the function is not strict, we must check first two args */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
+		PG_RETURN_NULL();
+	
+	xpath_expr_text = PG_GETARG_TEXT_P(0);
+	data  = PG_GETARG_XML_P(1);
+	
+	/* Namespace mappings passed as text[].
+	 * Assume that 2-dimensional array has been passed, 
+	 * the 1st subarray is array of names, the 2nd -- array of URIs,
+	 * example: ARRAY[ARRAY['myns', 'myns2'], ARRAY['http://example.com', 'http://example2.com']]. 
+	 */
+	if (!PG_ARGISNULL(2))
+	{
+		namespaces = PG_GETARG_ARRAYTYPE_P(2);
+		ndims = ARR_NDIM(namespaces);
+		dims = ARR_DIMS(namespaces);
+		
+		/* Sanity check */
+		if (ndims != 2)
+			ereport(ERROR, (errmsg("invalid array passed for namespace mappings"),
+							errdetail("Only 2-dimensional array may be used for namespace mappings.")));
+		
+		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+		
+		ns_count = ArrayGetNItems(ndims, dims) / 2;
+		get_typlenbyvalalign(ARR_ELEMTYPE(namespaces),
+							 &typlen, &typbyval, &typalign);
+		ns_names = (char **) palloc(ns_count * sizeof(char *));
+		ns_uris = (char **) palloc(ns_count * sizeof(char *));
+		ptr = ARR_DATA_PTR(namespaces);
+		bitmap = ARR_NULLBITMAP(namespaces);
+		bitmask = 1;
+		
+		for (i = 0; i < ns_count * 2; i++)
+		{
+			if (bitmap && (*bitmap & bitmask) == 0)
+				ereport(ERROR, (errmsg("neither namespace nor URI may be NULL"))); /* TODO: better message */
+			else
+			{
+				if (i < ns_count)
+					ns_names[i] = DatumGetCString(DirectFunctionCall1(textout,
+														  PointerGetDatum(ptr)));
+				else
+					ns_uris[i - ns_count] = DatumGetCString(DirectFunctionCall1(textout,
+														  PointerGetDatum(ptr)));
+				ptr = att_addlength(ptr, typlen, PointerGetDatum(ptr));
+				ptr = (char *) att_align(ptr, typalign);
+			}
+	
+			/* advance bitmap pointer if any */
+			if (bitmap)
+			{
+				bitmask <<= 1;
+				if (bitmask == 0x100)
+				{
+					bitmap++;
+					bitmask = 1;
+				}
+			}
+		}
+	}
+	
+	len = VARSIZE(data) - VARHDRSZ;
+	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+	if (xpath_len == 0)
+		ereport(ERROR, (errmsg("empty XPath expression")));
+	
+	if (xmlStrncmp((xmlChar *) VARDATA(data), (xmlChar *) "<?xml", 5) == 0)
+	{
+		string = palloc(len + 1);
+		memcpy(string, VARDATA(data), len);
+		string[len] = '\0';
+		xpath_expr = palloc(xpath_len + 1);
+		memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+		xpath_expr[xpath_len] = '\0';
+	}
+	else
+	{
+		/* use "<x>...</x>" as dummy root element to enable XPath for fragments */
+		/* TODO: (XPath for fragment) find better solution to work with XML fragment! */
+		string = xmlStrncatNew((xmlChar *) "<x>", (xmlChar *) VARDATA(data), len);
+		string = xmlStrncat(string, (xmlChar *) "</x>", 5);
+		len += 7;
+		xpath_expr = xmlStrncatNew((xmlChar *) "/x", (xmlChar *) VARDATA(xpath_expr_text), xpath_len);
+		xpath_len += 2;
+	}
+	
+	xml_init();
+
+	PG_TRY();
+	{
+		/* redundant XML parsing (two parsings for the same value in the same session are possible) */
+		ctxt = xmlNewParserCtxt();
+		if (ctxt == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not allocate parser context");
+		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+		if (doc == NULL)
+			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+						"could not parse XML data");
+		xpathctx = xmlXPathNewContext(doc);
+		if (xpathctx == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not allocate XPath context");
+		xpathctx->node = xmlDocGetRootElement(doc);
+		if (xpathctx->node == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not find root XML element"); 
+
+		/* register namespaces, if any */
+		if ((ns_count > 0) && ns_names && ns_uris)
+			for (i = 0; i < ns_count; i++)
+				if (0 != xmlXPathRegisterNs(xpathctx, (xmlChar *) ns_names[i], (xmlChar *) ns_uris[i]))
+					ereport(ERROR, 
+						(errmsg("could not register XML namespace with prefix=\"%s\" and href=\"%s\"", ns_names[i], ns_uris[i])));
+		
+		xpathcomp = xmlXPathCompile(xpath_expr);
+		if (xpathcomp == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"invalid XPath expression"); /* TODO: show proper XPath error details */
+		
+		xpathobj = xmlXPathCompiledEval(xpathcomp, xpathctx);
+		xmlXPathFreeCompExpr(xpathcomp);
+		if (xpathobj == NULL)
+			ereport(ERROR, (errmsg("could not create XPath object")));
+		
+		if (xpathobj->nodesetval == NULL)
+			res_is_null = TRUE;
+		
+		if (!res_is_null && xpathobj->nodesetval->nodeNr == 0)
+			/* TODO maybe empty array should be here, not NULL? (if so -- fix segfault) */
+			/*PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));*/
+			res_is_null = TRUE;
+		
+		if (!res_is_null) 
+			for (i = 0; i < xpathobj->nodesetval->nodeNr; i++)
+			{
+				Datum		elem;
+				bool		elemisnull = false;
+				elem = PointerGetDatum(xml_xmlnodetotext(xpathobj->nodesetval->nodeTab[i]));
+				astate = accumArrayResult(astate, elem,
+										  elemisnull, XMLOID,
+										  CurrentMemoryContext);
+			}
+		
+		xmlXPathFreeObject(xpathobj);
+		xmlXPathFreeContext(xpathctx);
+		xmlFreeParserCtxt(ctxt);
+		xmlFreeDoc(doc);
+		xmlCleanupParser();
+	}
+	PG_CATCH();
+	{
+		if (xpathcomp)
+			xmlXPathFreeCompExpr(xpathcomp);
+		if (xpathobj)
+			xmlXPathFreeObject(xpathobj);
+		if (xpathctx)
+			xmlXPathFreeContext(xpathctx);
+		if (doc)
+			xmlFreeDoc(doc);
+		if (ctxt)
+			xmlFreeParserCtxt(ctxt);
+		xmlCleanupParser();
+
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+	
+	if (res_is_null)
+	{
+		PG_RETURN_NULL();
+	}
+	else
+	{
+		PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));
+	}
+#else
+	NO_XML_SUPPORT();
+	return 0;
+#endif
+}
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.447
diff -u -r1.447 pg_proc.h
--- src/include/catalog/pg_proc.h	3 Mar 2007 19:52:46 -0000	1.447
+++ src/include/catalog/pg_proc.h	5 Mar 2007 01:14:57 -0000
@@ -4073,6 +4073,10 @@
 DATA(insert OID = 2930 (  query_to_xml_and_xmlschema  PGNSP PGUID 12 100 0 f f t f s 4 142 "25 16 16 25" _null_ _null_ "{query,nulls,tableforest,targetns}" query_to_xml_and_xmlschema - _null_ ));
 DESCR("map query result and structure to XML and XML Schema");
 
+DATA(insert OID = 2931 (  xmlpath      PGNSP PGUID 12 1 0 f f f f i 3 143 "25 142 1009" _null_ _null_ _null_ xmlpath - _null_ ));
+DESCR("evaluate XPath expression, with namespaces support");
+DATA(insert OID = 2932 (  xmlpath      PGNSP PGUID 14 1 0 f f f f i 2 143 "25 142" _null_ _null_ _null_ "select pg_catalog.xmlpath($1, $2, NULL)" - _null_ ));
+DESCR("evaluate XPath expression");
 
 /* uuid */ 
 DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 f f t f i 1 2950 "2275" _null_ _null_ _null_ uuid_in - _null_ ));
Index: src/test/regress/expected/xml_1.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/xml_1.out,v
retrieving revision 1.13
diff -u -r1.13 xml_1.out
--- src/test/regress/expected/xml_1.out	15 Feb 2007 05:05:03 -0000	1.13
+++ src/test/regress/expected/xml_1.out	5 Mar 2007 01:14:58 -0000
@@ -197,3 +197,18 @@
  xmlview5   | SELECT XMLPARSE(CONTENT '<abc>x</abc>'::text STRIP WHITESPACE) AS "xmlparse";
 (2 rows)
 
+-- Text XPath expressions evaluation
+SELECT xmlpath('/value', data) FROM xmltest;
+ xmlpath 
+---------
+(0 rows)
+
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+ERROR:  no XML support in this installation
+CONTEXT:  SQL function "xmlpath" statement 1
+SELECT xmlpath('', '<!-- error -->');
+ERROR:  no XML support in this installation
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+ERROR:  no XML support in this installation
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
+ERROR:  no XML support in this installation
Index: src/test/regress/expected/xml.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/xml.out,v
retrieving revision 1.15
diff -u -r1.15 xml.out
--- src/test/regress/expected/xml.out	15 Feb 2007 05:05:03 -0000	1.15
+++ src/test/regress/expected/xml.out	5 Mar 2007 01:14:58 -0000
@@ -401,3 +401,33 @@
  xmlview9   | SELECT XMLSERIALIZE(CONTENT 'good'::"xml" AS text) AS "xmlserialize";
 (9 rows)
 
+-- Text XPath expressions evaluation
+SELECT xmlpath('/value', data) FROM xmltest;
+ xmlpath 
+---------
+ {one}
+ {two}
+(2 rows)
+
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+ ?column? 
+----------
+ t
+ t
+(2 rows)
+
+SELECT xmlpath('', '<!-- error -->');
+ERROR:  empty XPath expression
+CONTEXT:  SQL function "xmlpath" statement 1
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+    xmlpath     
+----------------
+ {"number one"}
+(1 row)
+
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
+ xmlpath 
+---------
+ {1,2}
+(1 row)
+
Index: src/include/utils/xml.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/xml.h,v
retrieving revision 1.16
diff -u -r1.16 xml.h
--- src/include/utils/xml.h	16 Feb 2007 07:46:55 -0000	1.16
+++ src/include/utils/xml.h	5 Mar 2007 01:14:57 -0000
@@ -36,6 +36,7 @@
 extern Datum texttoxml(PG_FUNCTION_ARGS);
 extern Datum xmltotext(PG_FUNCTION_ARGS);
 extern Datum xmlvalidate(PG_FUNCTION_ARGS);
+extern Datum xmlpath(PG_FUNCTION_ARGS);
 
 extern Datum table_to_xml(PG_FUNCTION_ARGS);
 extern Datum query_to_xml(PG_FUNCTION_ARGS);
Index: src/test/regress/sql/xml.sql
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/sql/xml.sql,v
retrieving revision 1.12
diff -u -r1.12 xml.sql
--- src/test/regress/sql/xml.sql	15 Feb 2007 05:05:03 -0000	1.12
+++ src/test/regress/sql/xml.sql	5 Mar 2007 01:14:58 -0000
@@ -144,3 +144,11 @@
 
 SELECT table_name, view_definition FROM information_schema.views
   WHERE table_name LIKE 'xmlview%' ORDER BY 1;
+
+-- Text XPath expressions evaluation
+
+SELECT xmlpath('/value', data) FROM xmltest;
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+SELECT xmlpath('', '<!-- error -->');
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
#7Andrew Dunstan
andrew@dunslane.net
In reply to: Nikolay Samokhvalov (#6)
Re: [PATCHES] xpath_array with namespaces support

Nikolay Samokhvalov wrote:

What about this patch? Without this small patch, XML functionality in 8.3
will be weak...
Will it be accepted?

In principle I am in favor of the patch.

Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?

Perhaps even something in a special namespace?

cheers

andrew

#8Nikolay Samokhvalov
nikolay@samokhvalov.com
In reply to: Andrew Dunstan (#7)
Re: [PATCHES] xpath_array with namespaces support

On 3/17/07, Andrew Dunstan <andrew@dunslane.net> wrote:

In principle I am in favor of the patch.

Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?

Perhaps even something in a special namespace?

I did think about it, but I didn't find any difficulties with a simple
<x>...</x>. The thing is that, regardless of the element name, we apply a
corresponding shift to the XPath expression, so there cannot be any
problem from my point of view... But maybe I'm missing something and
it's better to avoid a _possible_ problem. It depends on PostgreSQL code
style itself: what is the best approach in such cases? To guard against
unknown possible difficulties, or to keep the code clear?
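To make the "corresponding shift" concrete, here is a minimal sketch in plain C of the two string transformations described above: the fragment is wrapped in the dummy root, and the XPath expression is prefixed with the matching step. The helper names (concat3, wrap_fragment, shift_xpath) are hypothetical and use malloc rather than palloc or libxml2's xmlStrncatNew; they only illustrate the idea and are not the patch's code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not in the patch): concatenate three strings
 * into a newly malloc'd, NUL-terminated buffer. Caller frees the result.
 */
static char *
concat3(const char *a, const char *b, const char *c)
{
	size_t		la, lb, lc;
	char	   *out;

	assert(a != NULL && b != NULL && c != NULL);
	la = strlen(a);
	lb = strlen(b);
	lc = strlen(c);

	out = malloc(la + lb + lc + 1);
	if (out == NULL)
		return NULL;

	memcpy(out, a, la);
	memcpy(out + la, b, lb);
	memcpy(out + la + lb, c, lc);
	out[la + lb + lc] = '\0';
	return out;
}

/* Wrap an XML fragment in the dummy root element: "<x>" + fragment + "</x>". */
static char *
wrap_fragment(const char *fragment)
{
	return concat3("<x>", fragment, "</x>");
}

/* Shift the XPath expression below the dummy root: "/x" + expression. */
static char *
shift_xpath(const char *expr)
{
	return concat3("/x", expr, "");
}
```

With these helpers, the fragment <value>one</value> becomes <x><value>one</value></x> and the expression /value becomes /x/value, so the caller's path resolves against the wrapped content whatever name the dummy element has.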

--
Best regards,
Nikolay

#9Andrew Dunstan
andrew@dunslane.net
In reply to: Nikolay Samokhvalov (#8)
Re: [PATCHES] xpath_array with namespaces support

Nikolay Samokhvalov wrote:

On 3/17/07, Andrew Dunstan <andrew@dunslane.net> wrote:

In principle I am in favor of the patch.

Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?

Perhaps even something in a special namespace?

I did think about it, but I didn't find any difficulties with a simple
<x>...</x>. The thing is that, regardless of the element name, we apply a
corresponding shift to the XPath expression, so there cannot be any
problem from my point of view... But maybe I'm missing something and
it's better to avoid a _possible_ problem. It depends on PostgreSQL code
style itself: what is the best approach in such cases? To guard against
unknown possible difficulties, or to keep the code clear?

If you are sure that it won't cause a problem then I think it's ok to
leave it, as long as there is a comment in the code that says why we are
sure it's ok.

cheers

andrew

#10Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Nikolay Samokhvalov (#5)
1 attachment(s)
Re: [PATCHES] xpath_array with namespaces support

On 3/5/07, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:

On 3/4/07, Nikolay Samokhvalov <nikolay@samokhvalov.com> wrote:

I'll fix these issues and extend the patch with regression tests and
docs for xpath_array(). I'll resubmit it very soon.

Here is a new version of the patch. I didn't change any part of docs yet.
Since there were no objections I've changed the name of the function
to xmlpath().

The updated version of the patch contains a bugfix: there was a problem
with path queries that pointed to elements (cases where a set of
document parts corresponding to subtrees should be returned).
An example (included in the regression test):

xmltest=# SELECT xmlpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
xmlpath
-------------------------
{<b>two</b>,<b>etc</b>}
(1 row)
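The rule behind this fix, as implemented in xml_xmlnodetoxmltype() in the attached patch, is that element nodes are dumped as markup while other nodes (attributes, text) are returned as their string value. The sketch below models only that branching in plain C. ToyNode, ToyNodeKind and toy_node_to_result are hypothetical stand-ins for illustration; the real code works on libxml2's xmlNodePtr and calls xmlNodeDump() or xmlXPathCastNodeToString().

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, heavily simplified node model; libxml2's xmlNode is far richer. */
typedef enum
{
	TOY_ELEMENT,
	TOY_ATTRIBUTE,
	TOY_TEXT
} ToyNodeKind;

typedef struct
{
	ToyNodeKind kind;
	const char *name;			/* element or attribute name; unused for text */
	const char *value;			/* text content or attribute value */
} ToyNode;

/*
 * Write the result for one matched node into buf.
 * Elements are serialized as markup; attributes and text nodes yield
 * their string value only.
 * Returns the number of characters written (excluding the NUL),
 * or -1 if buf is too small.
 */
static int
toy_node_to_result(const ToyNode *node, char *buf, size_t buflen)
{
	int			n;

	assert(node != NULL && buf != NULL);

	if (node->kind == TOY_ELEMENT)
		n = snprintf(buf, buflen, "<%s>%s</%s>",
					 node->name, node->value, node->name);
	else
		n = snprintf(buf, buflen, "%s", node->value);

	if (n < 0 || (size_t) n >= buflen)
		return -1;
	return n;
}
```

Under this model, an element b with content "two" comes out as <b>two</b>, and an attribute id with value "1" comes out as 1, matching the {<b>two</b>,<b>etc</b>} and {1,2} results shown in the regression tests.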

Waiting for more feedback, please check it.

--
Best regards,
Nikolay

Attachments:

xpath.w.namespaces.20070318.patch (text/x-patch; charset=ANSI_X3.4-1968)
Index: src/backend/utils/adt/xml.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/xml.c,v
retrieving revision 1.35
diff -u -r1.35 xml.c
--- src/backend/utils/adt/xml.c	15 Mar 2007 23:12:06 -0000	1.35
+++ src/backend/utils/adt/xml.c	18 Mar 2007 13:32:21 -0000
@@ -47,6 +47,8 @@
 #include <libxml/uri.h>
 #include <libxml/xmlerror.h>
 #include <libxml/xmlwriter.h>
+#include <libxml/xpath.h>
+#include <libxml/xpathInternals.h>
 #endif /* USE_LIBXML */
 
 #include "catalog/namespace.h"
@@ -67,6 +69,7 @@
 #include "utils/datetime.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
+#include "access/tupmacs.h"
 #include "utils/xml.h"
 
 
@@ -88,6 +91,7 @@
 static int		parse_xml_decl(const xmlChar *str, size_t *lenp, xmlChar **version, xmlChar **encoding, int *standalone);
 static bool		print_xml_decl(StringInfo buf, const xmlChar *version, pg_enc encoding, int standalone);
 static xmlDocPtr xml_parse(text *data, XmlOptionType xmloption_arg, bool preserve_whitespace, xmlChar *encoding);
+static text		*xml_xmlnodetoxmltype(xmlNodePtr cur);
 
 #endif /* USE_LIBXML */
 
@@ -1463,7 +1467,6 @@
 	return buf.data;
 }
 
-
 /*
  * Map SQL value to XML value; see SQL/XML:2003 section 9.16.
  */
@@ -2403,3 +2406,258 @@
 	else
 		appendStringInfoString(result, "</row>\n\n");
 }
+
+
+/*
+ * XPath related functions
+ */
+
+#ifdef USE_LIBXML
+/* 
+ * Convert XML node to text (dump subtree in case of element, return value otherwise)
+ */
+text *
+xml_xmlnodetoxmltype(xmlNodePtr cur)
+{
+	xmlChar    			*str;
+	xmltype				*result;
+	size_t				len;
+	xmlBufferPtr 		buf;
+	
+	if (cur->type == XML_ELEMENT_NODE)
+	{
+		buf = xmlBufferCreate();
+		xmlNodeDump(buf, NULL, cur, 0, 1);
+		result = xmlBuffer_to_xmltype(buf);
+		xmlBufferFree(buf);
+	}
+	else
+	{
+		str = xmlXPathCastNodeToString(cur);
+		len = strlen((char *) str);
+		result = (text *) palloc(len + VARHDRSZ);
+		SET_VARSIZE(result, len + VARHDRSZ);
+		memcpy(VARDATA(result), str, len);
+	}
+	
+	return result;
+}
+#endif
+
+/*
+ * Evaluate XPath expression and return array of XML values.
+ * As we have no support of XQuery sequences yet, this function seems
+ * to be the most useful one (array of XML functions plays a role of
+ * some kind of substitution for XQuery sequences).
+
+ * Workaround here: we parse XML data in different way to allow XPath for
+ * fragments (see "XPath for fragment" TODO comment inside).
+ */
+Datum
+xmlpath(PG_FUNCTION_ARGS)
+{
+#ifdef USE_LIBXML
+	ArrayBuildState		*astate = NULL;
+	xmlParserCtxtPtr	ctxt = NULL;
+	xmlDocPtr			doc = NULL;
+	xmlXPathContextPtr	xpathctx = NULL;
+	xmlXPathCompExprPtr	xpathcomp = NULL;
+	xmlXPathObjectPtr	xpathobj = NULL;
+	int32				len, xpath_len;
+	xmlChar				*string, *xpath_expr;
+	bool				res_is_null = FALSE;
+	int					i;
+	xmltype				*data;
+	text				*xpath_expr_text;
+	ArrayType			*namespaces;
+	int					*dims, ndims, ns_count = 0, bitmask = 1;
+	char				*ptr;
+	bits8				*bitmap;
+	char				**ns_names = NULL, **ns_uris = NULL;
+	int16				typlen;
+	bool				typbyval;
+	char				typalign;
+	
+	/* the function is not strict, we must check first two args */
+	if (PG_ARGISNULL(0) || PG_ARGISNULL(1))
+		PG_RETURN_NULL();
+	
+	xpath_expr_text = PG_GETARG_TEXT_P(0);
+	data  = PG_GETARG_XML_P(1);
+	
+	/* Namespace mappings passed as text[].
+	 * Assume that 2-dimensional array has been passed, 
+	 * the 1st subarray is array of names, the 2nd -- array of URIs,
+	 * example: ARRAY[ARRAY['myns', 'myns2'], ARRAY['http://example.com', 'http://example2.com']]. 
+	 */
+	if (!PG_ARGISNULL(2))
+	{
+		namespaces = PG_GETARG_ARRAYTYPE_P(2);
+		ndims = ARR_NDIM(namespaces);
+		dims = ARR_DIMS(namespaces);
+		
+		/* Sanity check */
+		if (ndims != 2)
+			ereport(ERROR, (errmsg("invalid array passed for namespace mappings"),
+							errdetail("Only 2-dimensional array may be used for namespace mappings.")));
+		
+		Assert(ARR_ELEMTYPE(namespaces) == TEXTOID);
+		
+		ns_count = ArrayGetNItems(ndims, dims) / 2;
+		get_typlenbyvalalign(ARR_ELEMTYPE(namespaces),
+							 &typlen, &typbyval, &typalign);
+		ns_names = (char **) palloc(ns_count * sizeof(char *));
+		ns_uris = (char **) palloc(ns_count * sizeof(char *));
+		ptr = ARR_DATA_PTR(namespaces);
+		bitmap = ARR_NULLBITMAP(namespaces);
+		bitmask = 1;
+		
+		for (i = 0; i < ns_count * 2; i++)
+		{
+			if (bitmap && (*bitmap & bitmask) == 0)
+				ereport(ERROR, (errmsg("neither namespace nor URI may be NULL"))); /* TODO: better message */
+			else
+			{
+				if (i < ns_count)
+					ns_names[i] = DatumGetCString(DirectFunctionCall1(textout,
+														  PointerGetDatum(ptr)));
+				else
+					ns_uris[i - ns_count] = DatumGetCString(DirectFunctionCall1(textout,
+														  PointerGetDatum(ptr)));
+				ptr = att_addlength(ptr, typlen, PointerGetDatum(ptr));
+				ptr = (char *) att_align(ptr, typalign);
+			}
+	
+			/* advance bitmap pointer if any */
+			if (bitmap)
+			{
+				bitmask <<= 1;
+				if (bitmask == 0x100)
+				{
+					bitmap++;
+					bitmask = 1;
+				}
+			}
+		}
+	}
+	
+	len = VARSIZE(data) - VARHDRSZ;
+	xpath_len = VARSIZE(xpath_expr_text) - VARHDRSZ;
+	if (xpath_len == 0)
+		ereport(ERROR, (errmsg("empty XPath expression")));
+	
+	if (xmlStrncmp((xmlChar *) VARDATA(data), (xmlChar *) "<?xml", 5) == 0)
+	{
+		string = palloc(len + 1);
+		memcpy(string, VARDATA(data), len);
+		string[len] = '\0';
+		xpath_expr = palloc(xpath_len + 1);
+		memcpy(xpath_expr, VARDATA(xpath_expr_text), xpath_len);
+		xpath_expr[xpath_len] = '\0';
+	}
+	else
+	{
+		/* use "<x>...</x>" as dummy root element to enable XPath for fragments */
+		/* TODO: (XPath for fragment) find better solution to work with XML fragment! */
+		string = xmlStrncatNew((xmlChar *) "<x>", (xmlChar *) VARDATA(data), len);
+		string = xmlStrncat(string, (xmlChar *) "</x>", 5);
+		len += 7;
+		xpath_expr = xmlStrncatNew((xmlChar *) "/x", (xmlChar *) VARDATA(xpath_expr_text), xpath_len);
+		xpath_len += 2;
+	}
+	
+	xml_init();
+
+	PG_TRY();
+	{
+		/* redundant XML parsing (two parsings for the same value in the same session are possible) */
+		ctxt = xmlNewParserCtxt();
+		if (ctxt == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not allocate parser context");
+		doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
+		if (doc == NULL)
+			xml_ereport(ERROR, ERRCODE_INVALID_XML_DOCUMENT,
+						"could not parse XML data");
+		xpathctx = xmlXPathNewContext(doc);
+		if (xpathctx == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not allocate XPath context");
+		xpathctx->node = xmlDocGetRootElement(doc);
+		if (xpathctx->node == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"could not find root XML element"); 
+
+		/* register namespaces, if any */
+		if ((ns_count > 0) && ns_names && ns_uris)
+			for (i = 0; i < ns_count; i++)
+				if (0 != xmlXPathRegisterNs(xpathctx, (xmlChar *) ns_names[i], (xmlChar *) ns_uris[i]))
+					ereport(ERROR, 
+						(errmsg("could not register XML namespace with prefix=\"%s\" and href=\"%s\"", ns_names[i], ns_uris[i])));
+		
+		xpathcomp = xmlXPathCompile(xpath_expr);
+		if (xpathcomp == NULL)
+			xml_ereport(ERROR, ERRCODE_INTERNAL_ERROR,
+						"invalid XPath expression"); /* TODO: show proper XPath error details */
+		
+		xpathobj = xmlXPathCompiledEval(xpathcomp, xpathctx);
+		xmlXPathFreeCompExpr(xpathcomp);
+		if (xpathobj == NULL)
+			ereport(ERROR, (errmsg("could not create XPath object")));
+		
+		if (xpathobj->nodesetval == NULL)
+			res_is_null = TRUE;
+		
+		if (!res_is_null && xpathobj->nodesetval->nodeNr == 0)
+			/* TODO maybe empty array should be here, not NULL? (if so -- fix segfault) */
+			/*PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));*/
+			res_is_null = TRUE;
+		
+		if (!res_is_null) 
+			for (i = 0; i < xpathobj->nodesetval->nodeNr; i++)
+			{
+				Datum		elem;
+				bool		elemisnull = false;
+				elem = PointerGetDatum(xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i]));
+				astate = accumArrayResult(astate, elem,
+										  elemisnull, XMLOID,
+										  CurrentMemoryContext);
+			}
+		
+		xmlXPathFreeObject(xpathobj);
+		xmlXPathFreeContext(xpathctx);
+		xmlFreeParserCtxt(ctxt);
+		xmlFreeDoc(doc);
+		xmlCleanupParser();
+	}
+	PG_CATCH();
+	{
+		if (xpathcomp)
+			xmlXPathFreeCompExpr(xpathcomp);
+		if (xpathobj)
+			xmlXPathFreeObject(xpathobj);
+		if (xpathctx)
+			xmlXPathFreeContext(xpathctx);
+		if (doc)
+			xmlFreeDoc(doc);
+		if (ctxt)
+			xmlFreeParserCtxt(ctxt);
+		xmlCleanupParser();
+
+		PG_RE_THROW();
+	}
+	PG_END_TRY();
+	
+	if (res_is_null)
+	{
+		PG_RETURN_NULL();
+	}
+	else
+	{
+		PG_RETURN_ARRAYTYPE_P(makeArrayResult(astate, CurrentMemoryContext));
+	}
+#else
+	NO_XML_SUPPORT();
+	return 0;
+#endif
+}
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.448
diff -u -r1.448 pg_proc.h
--- src/include/catalog/pg_proc.h	16 Mar 2007 17:57:36 -0000	1.448
+++ src/include/catalog/pg_proc.h	18 Mar 2007 13:32:21 -0000
@@ -4083,6 +4083,10 @@
 DATA(insert OID = 2930 (  query_to_xml_and_xmlschema  PGNSP PGUID 12 100 0 f f t f s 4 142 "25 16 16 25" _null_ _null_ "{query,nulls,tableforest,targetns}" query_to_xml_and_xmlschema - _null_ ));
 DESCR("map query result and structure to XML and XML Schema");
 
+DATA(insert OID = 2931 (  xmlpath      PGNSP PGUID 12 1 0 f f f f i 3 143 "25 142 1009" _null_ _null_ _null_ xmlpath - _null_ ));
+DESCR("evaluate XPath expression, with namespaces support");
+DATA(insert OID = 2932 (  xmlpath      PGNSP PGUID 14 1 0 f f f f i 2 143 "25 142" _null_ _null_ _null_ "select pg_catalog.xmlpath($1, $2, NULL)" - _null_ ));
+DESCR("evaluate XPath expression");
 
 /* uuid */ 
 DATA(insert OID = 2952 (  uuid_in		   PGNSP PGUID 12 1 0 f f t f i 1 2950 "2275" _null_ _null_ _null_ uuid_in - _null_ ));
Index: src/test/regress/expected/xml_1.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/xml_1.out,v
retrieving revision 1.13
diff -u -r1.13 xml_1.out
--- src/test/regress/expected/xml_1.out	15 Feb 2007 05:05:03 -0000	1.13
+++ src/test/regress/expected/xml_1.out	18 Mar 2007 13:32:21 -0000
@@ -197,3 +197,20 @@
  xmlview5   | SELECT XMLPARSE(CONTENT '<abc>x</abc>'::text STRIP WHITESPACE) AS "xmlparse";
 (2 rows)
 
+-- Text XPath expressions evaluation
+SELECT xmlpath('/value', data) FROM xmltest;
+ xmlpath 
+---------
+(0 rows)
+
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+ERROR:  no XML support in this installation
+CONTEXT:  SQL function "xmlpath" statement 1
+SELECT xmlpath('', '<!-- error -->');
+ERROR:  no XML support in this installation
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+ERROR:  no XML support in this installation
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
+ERROR:  no XML support in this installation
+SELECT xmlpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
+ERROR:  no XML support in this installation
Index: src/test/regress/expected/xml.out
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/expected/xml.out,v
retrieving revision 1.15
diff -u -r1.15 xml.out
--- src/test/regress/expected/xml.out	15 Feb 2007 05:05:03 -0000	1.15
+++ src/test/regress/expected/xml.out	18 Mar 2007 13:32:21 -0000
@@ -401,3 +401,39 @@
  xmlview9   | SELECT XMLSERIALIZE(CONTENT 'good'::"xml" AS text) AS "xmlserialize";
 (9 rows)
 
+-- Text XPath expressions evaluation
+SELECT xmlpath('/value', data) FROM xmltest;
+       xmlpath        
+----------------------
+ {<value>one</value>}
+ {<value>two</value>}
+(2 rows)
+
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+ ?column? 
+----------
+ t
+ t
+(2 rows)
+
+SELECT xmlpath('', '<!-- error -->');
+ERROR:  empty XPath expression
+CONTEXT:  SQL function "xmlpath" statement 1
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+    xmlpath     
+----------------
+ {"number one"}
+(1 row)
+
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
+ xmlpath 
+---------
+ {1,2}
+(1 row)
+
+SELECT xmlpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
+         xmlpath         
+-------------------------
+ {<b>two</b>,<b>etc</b>}
+(1 row)
+
Index: src/include/utils/xml.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/xml.h,v
retrieving revision 1.16
diff -u -r1.16 xml.h
--- src/include/utils/xml.h	16 Feb 2007 07:46:55 -0000	1.16
+++ src/include/utils/xml.h	18 Mar 2007 13:32:21 -0000
@@ -36,6 +36,7 @@
 extern Datum texttoxml(PG_FUNCTION_ARGS);
 extern Datum xmltotext(PG_FUNCTION_ARGS);
 extern Datum xmlvalidate(PG_FUNCTION_ARGS);
+extern Datum xmlpath(PG_FUNCTION_ARGS);
 
 extern Datum table_to_xml(PG_FUNCTION_ARGS);
 extern Datum query_to_xml(PG_FUNCTION_ARGS);
Index: src/test/regress/sql/xml.sql
===================================================================
RCS file: /projects/cvsroot/pgsql/src/test/regress/sql/xml.sql,v
retrieving revision 1.12
diff -u -r1.12 xml.sql
--- src/test/regress/sql/xml.sql	15 Feb 2007 05:05:03 -0000	1.12
+++ src/test/regress/sql/xml.sql	18 Mar 2007 13:32:21 -0000
@@ -144,3 +144,12 @@
 
 SELECT table_name, view_definition FROM information_schema.views
   WHERE table_name LIKE 'xmlview%' ORDER BY 1;
+
+-- Text XPath expressions evaluation
+
+SELECT xmlpath('/value', data) FROM xmltest;
+SELECT xmlpath(NULL, NULL) IS NULL FROM xmltest;
+SELECT xmlpath('', '<!-- error -->');
+SELECT xmlpath('//text()', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>');
+SELECT xmlpath('//loc:piece/@id', '<local:data xmlns:local="http://127.0.0.1"><local:piece id="1">number one</local:piece><local:piece id="2" /></local:data>', ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
+SELECT xmlpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
#11Bruce Momjian
bruce@momjian.us
In reply to: Nikolay Samokhvalov (#5)
Re: xpath_array with namespaces support

Patch applied.

Please provide a documentation addition. Thanks.

---------------------------------------------------------------------------

Nikolay Samokhvalov wrote:

On 3/4/07, Nikolay Samokhvalov <nikolay@samokhvalov.com> wrote:

I'll fix these issues and extend the patch with regression tests and
docs for xpath_array(). I'll resubmit it very soon.

Here is a new version of the patch. I didn't change any part of docs yet.
Since there were no objections I've changed the name of the function
to xmlpath().

--
Best regards,
Nikolay

[ Attachment, skipping... ]

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#12Bruce Momjian
bruce@momjian.us
In reply to: Nikolay Samokhvalov (#10)
Re: [PATCHES] xpath_array with namespaces support

Applying newest version of this patch now; still needs documentation.

---------------------------------------------------------------------------

Nikolay Samokhvalov wrote:

On 3/5/07, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:

On 3/4/07, Nikolay Samokhvalov <nikolay@samokhvalov.com> wrote:

I'll fix these issues and extend the patch with regression tests and
docs for xpath_array(). I'll resubmit it very soon.

Here is a new version of the patch. I didn't change any part of docs yet.
Since there were no objections I've changed the name of the function
to xmlpath().

The updated version of the patch contains a bugfix: there was a problem
with path queries that pointed to elements (cases where a set of
document parts corresponding to subtrees should be returned).
An example (included in the regression test):

xmltest=# SELECT xmlpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
xmlpath
-------------------------
{<b>two</b>,<b>etc</b>}
(1 row)

Waiting for more feedback, please check it.

--
Best regards,
Nikolay

[ Attachment, skipping... ]

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +

#13Peter Eisentraut
peter_e@gmx.net
In reply to: Nikolay Samokhvalov (#5)
Re: xpath_array with namespaces support

Nikolay Samokhvalov wrote:

Here is a new version of the patch. I didn't change any part of docs
yet. Since there were no objections I've changed the name of the
function to xmlpath().

I didn't see any discussion about changing the name to xmlpath. Seeing
that the function implements xpath, and xpath is a recognized name,
this change is wrong.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#14Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#11)
Re: xpath_array with namespaces support

Bruce Momjian wrote:

Patch applied.

This code seems to think that if an xml datum starts with "<?xml" it's a
document. That is completely bogus.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#15Peter Eisentraut
peter_e@gmx.net
In reply to: Nikolay Samokhvalov (#5)
Re: xpath_array with namespaces support

Nikolay Samokhvalov wrote:

On 3/4/07, Nikolay Samokhvalov <nikolay@samokhvalov.com> wrote:

I'll fix these issues and extend the patch with regression tests
and docs for xpath_array(). I'll resubmit it very soon.

Here is a new version of the patch. I didn't change any part of docs
yet. Since there were no objections I've changed the name of the
function to xmlpath().

Why is the function not strict?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#16Peter Eisentraut
peter_e@gmx.net
In reply to: Andrew Dunstan (#7)
Re: [PATCHES] xpath_array with namespaces support

Andrew Dunstan wrote:

Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?

Why do we even need to support xpath on fragments?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#17Peter Eisentraut
peter_e@gmx.net
In reply to: Nikolay Samokhvalov (#1)
Re: xpath_array with namespaces support

Nikolay Samokhvalov wrote:

Also, maybe someone can suggest better approach for passing namespace
bindings (more convenient than ARRAY[ARRAY[...], ARRAY[...]])?

Your code assumes

ARRAY[ARRAY['myns', 'myns2'], ARRAY['http://example.com', 'http://example2.com']]

Shouldn't it be

ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2', 'http://example2.com']]

?
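
As an illustration of the second layout (one {prefix, URI} pair per row),
here is a minimal C sketch. It is not taken from the patch: NsBinding,
lookup_ns_uri and example_bindings are hypothetical names, and a real
implementation would presumably hand each pair to libxml2 (e.g. via
xmlXPathRegisterNs()) rather than look it up by hand.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One binding per row, i.e. ARRAY[ARRAY[prefix, uri], ...]. */
typedef struct NsBinding
{
    const char *prefix;
    const char *uri;
} NsBinding;

/* The two bindings from the example, in the row-per-binding layout. */
const NsBinding example_bindings[] = {
    {"myns", "http://example.com"},
    {"myns2", "http://example2.com"},
};

/*
 * Hypothetical helper: return the URI bound to the given prefix,
 * or NULL if the prefix has no binding.
 */
const char *
lookup_ns_uri(const NsBinding *bindings, size_t n, const char *prefix)
{
    size_t i;

    assert(bindings != NULL || n == 0);
    assert(prefix != NULL);

    for (i = 0; i < n; i++)
    {
        if (strcmp(bindings[i].prefix, prefix) == 0)
            return bindings[i].uri;
    }
    return NULL;
}
```

With this layout each row is self-describing, so a prefix and its URI
cannot drift apart the way two parallel rows can.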

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#18Nikolay Samokhvalov
nikolay@samokhvalov.com
In reply to: Peter Eisentraut (#16)
Re: [PATCHES] xpath_array with namespaces support

On 3/23/07, Peter Eisentraut <peter_e@gmx.net> wrote:

Andrew Dunstan wrote:

Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?

Why do we even need to support xpath on fragments?

Why not? I find it useful and convenient.

--
Best regards,
Nikolay

#19Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#11)
Re: xpath_array with namespaces support

On Wednesday, 4 April 2007 14:42, Nikolay Samokhvalov wrote:

Maybe it's worth starting to keep additional information in the xml datum (i.e.
an IS_DOCUMENT bit and, more importantly for the xpath() function, a bit
indicating that the XML value has only one root and can be considered a tree
=> there is no need to wrap it with <x> .. </x>). But this change requires
additional time to design the xml datum structure and to rework the code
(macros, I/O functions...).

To determine if an XML datum is a document, call xml_is_document(). The
implementation of that function is probably not the best possible one, but
what the xpath() code does is totally wrong nevertheless.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#20Peter Eisentraut
peter_e@gmx.net
In reply to: Nikolay Samokhvalov (#1)
Re: xpath_array with namespaces support

On Wednesday, 4 April 2007 14:42, Nikolay Samokhvalov wrote:

Why is the function not strict?

Because in case of 3rd argument (NS mappings) being NULL, we shouldn't
return NULL immediately:

If the namespace mapping is NULL then it is unknown, and therefore the result
of the XPath expression cannot be evaluated with certainty. If no namespace
mapping is to be passed, then you should pass a list(/array/...) of length
zero.

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#21Peter Eisentraut
peter_e@gmx.net
In reply to: Nikolay Samokhvalov (#18)
Re: xpath_array with namespaces support

On Wednesday, 4 April 2007 14:43, Nikolay Samokhvalov wrote:

Why do we even need to support xpath on fragments?

Why not? I find it useful and convenient.

Well, rather than inventing bogus root wrapper elements, why not let users
call xmlelement() to produce the wrapper element themselves?

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#22Nikolay Samokhvalov
nikolay@samokhvalov.com
In reply to: Peter Eisentraut (#21)
Re: [PATCHES] xpath_array with namespaces support

On 4/4/07, Peter Eisentraut <peter_e@gmx.net> wrote:

On Wednesday, 4 April 2007 14:43, Nikolay Samokhvalov wrote:

Why do we even need to support xpath on fragments?

Why not? I find it useful and convenient.

Well, rather than inventing bogus root wrapper elements, why not let users
call xmlelement() to produce the wrapper element themselves?

A user may not even know in which cases a wrapper element is needed. I mean, if
a user works with an XML column containing both documents and fragments, then
what must he do? Add the wrapper anyway? So users will add XMLELEMENT in almost
every case.

I'd prefer to keep external interfaces simpler (less thinking in such cases
for users).

--
Best regards,
Nikolay

#23Peter Eisentraut
peter_e@gmx.net
In reply to: Bruce Momjian (#11)
Re: xpath_array with namespaces support

On Wednesday, 4 April 2007 15:20, Nikolay Samokhvalov wrote:

To determine if an XML datum is a document, call xml_is_document(). The
implementation of that function is probably not the best possible one,
but what the xpath() code does is totally wrong nevertheless.

You are proposing to parse one XML value 2-3 times (depending on the case)
instead of the current 1-2 times.

I know it's bad, and something like adding a bit (byte) to mark this in the
value would be good, but that doesn't change the fact that

(xmlStrncmp((xmlChar *) VARDATA(data), (xmlChar *) "<?xml", 5) == 0)

is not a valid method to tell apart a document from a fragment. Proof:

pei=# select xml '<?xml version="1.0"?><foo>bar</foo>' IS DOCUMENT;
?column?
----------
t
(1 row)

pei=# select xml '<?xml version="1.0"?><foo>bar</foo><foo>bar</foo>' IS DOCUMENT;
?column?
----------
f
(1 row)

pei=# select xml '<foo>bar</foo>' IS DOCUMENT;
?column?
----------
t
(1 row)

pei=# select xml '<foo>bar</foo><foo>bar</foo>' IS DOCUMENT;
?column?
----------
f
(1 row)
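
The same comparison can be written down as a minimal, self-contained C
sketch. It is not part of the patch: plain strncmp() stands in for
libxml2's xmlStrncmp(), and starts_with_xml_decl, DocCase, doc_cases and
count_mismatches are hypothetical names. The table pairs each of the four
values above with the IS DOCUMENT result shown in the session.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Mirrors the quoted check: true when the value starts with "<?xml". */
bool
starts_with_xml_decl(const char *data)
{
    assert(data != NULL);
    return strncmp(data, "<?xml", 5) == 0;
}

/* A value together with the IS DOCUMENT result reported for it. */
typedef struct DocCase
{
    const char *value;
    bool        is_document;
} DocCase;

const DocCase doc_cases[] = {
    {"<?xml version=\"1.0\"?><foo>bar</foo>", true},
    {"<?xml version=\"1.0\"?><foo>bar</foo><foo>bar</foo>", false},
    {"<foo>bar</foo>", true},
    {"<foo>bar</foo><foo>bar</foo>", false},
};

/* Count the cases where the prefix check disagrees with IS DOCUMENT. */
int
count_mismatches(void)
{
    int     mismatches = 0;
    size_t  i;

    for (i = 0; i < sizeof(doc_cases) / sizeof(doc_cases[0]); i++)
    {
        if (starts_with_xml_decl(doc_cases[i].value) != doc_cases[i].is_document)
            mismatches++;
    }
    return mismatches;
}
```

The prefix check disagrees on the second value (a declaration followed by
two root elements) and on the third (a document without a declaration).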

--
Peter Eisentraut
http://developer.postgresql.org/~petere/

#24Nikolay Samokhvalov
samokhvalov@gmail.com
In reply to: Bruce Momjian (#11)
Re: xpath_array with namespaces support

On 4/4/07, Nikolay Samokhvalov <nikolay@samokhvalov.com> wrote:

So, choosing between two inefficient approaches:
1. mine, which in some cases uses dummy element wrapping that we could
avoid;
2. the one proposed by you, which leads to +1 parsing.
... I'd definitely choose the first one.

Let me make this a bit clearer.

We have different cases for XML value as input of xpath():
a. document with prolog ('<?xml...')
b. document w/o prolog (value that can be represented as a tree -- i.e. we
have one root)
c. fragment with one root element (can be represented as a tree)
d. fragment w/o root element (cannot be represented as a tree, e.g.
'bla'::xml)

So, the current implementation works w/o wrapping in case a) and uses
wrapping for cases b)-d).
But we _need_ wrapping _only_ in case d), so there is room for
optimization (I would keep a "this value is not a tree" bit in the value
itself).
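
The four cases can be summarized as a small decision table. This is only
a sketch in C with hypothetical names (XmlInputKind, wraps_currently,
wrapping_required, wrapping_avoidable), not code from the patch, and it
assumes the kind of the value is already known, e.g. from such a bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Cases a)-d) from the list above. */
typedef enum XmlInputKind
{
    DOC_WITH_PROLOG,        /* a) document with prolog */
    DOC_WITHOUT_PROLOG,     /* b) document w/o prolog, one root */
    FRAGMENT_ONE_ROOT,      /* c) fragment with one root element */
    FRAGMENT_NO_ROOT        /* d) fragment w/o root element */
} XmlInputKind;

/* Current behaviour as described: everything except case a) is wrapped. */
bool
wraps_currently(XmlInputKind kind)
{
    assert((int) kind >= 0 && (int) kind <= (int) FRAGMENT_NO_ROOT);
    return kind != DOC_WITH_PROLOG;
}

/* Wrapping is actually needed only when the value is not a tree. */
bool
wrapping_required(XmlInputKind kind)
{
    return kind == FRAGMENT_NO_ROOT;
}

/* True for the cases where the dummy wrapper could be skipped. */
bool
wrapping_avoidable(XmlInputKind kind)
{
    return wraps_currently(kind) && !wrapping_required(kind);
}
```

Under these assumptions the avoidable wrapping is exactly cases b) and c).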

--
Best regards,
Nikolay