xpath_array with namespaces support
Following the discussion with Peter, here is a modified patch for
xpath_array() with namespace support.
The signature is:
_xml xpath_array(text xpathQuery, xml xmlValue[, _text namespacesBindings])
The third argument is a 2-dimensional array defining the namespace
bindings. Simple examples:
xmltest=# SELECT xpath_array('//text()', '<local:data
xmlns:local="http://127.0.0.1"><local:piece id="1">number
one</local:piece><local:piece id="2" /></local:data>');
xpath_array
----------------
{"number one"}
(1 row)
xmltest=# SELECT xpath_array('//loc:piece/@id', '<local:data
xmlns:local="http://127.0.0.1"><local:piece id="1">number
one</local:piece><local:piece id="2" /></local:data>',
ARRAY[ARRAY['loc'], ARRAY['http://127.0.0.1']]);
xpath_array
-------------
{1,2}
(1 row)
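The second example relies on the fact that the prefix used in the query (loc) need not match the prefix used in the document (local); only the namespace URI has to agree. A minimal sketch of that behaviour, using Python's standard xml.etree.ElementTree instead of the patch's C code (the helper name piece_ids is made up for illustration, and since ElementTree's path subset cannot select attributes directly, the sketch selects the elements and reads id from each):

```python
import xml.etree.ElementTree as ET

DOC = (
    '<local:data xmlns:local="http://127.0.0.1">'
    '<local:piece id="1">number one</local:piece>'
    '<local:piece id="2" />'
    '</local:data>'
)

def piece_ids(xml_text, bindings):
    """Return the id of every piece element, resolving 'loc' via the caller's bindings."""
    root = ET.fromstring(xml_text)
    # 'loc' is resolved through `bindings`, not through the document's own prefixes
    return [el.get("id") for el in root.findall(".//loc:piece", bindings)]

print(piece_ids(DOC, {"loc": "http://127.0.0.1"}))  # ['1', '2']
```

Binding loc to a different URI yields an empty result, which is why the bindings have to be passed explicitly.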
Thoughts regarding other XPath functions were posted a couple of days
ago: http://archives.postgresql.org/pgsql-patches/2007-02/msg00373.php
If there are no objections, we could name the function provided in this
patch xpath() or xmlpath() (the latter is similar to the names of the
SQL/XML functions).
Also, maybe someone can suggest a better approach for passing namespace
bindings (more convenient than ARRAY[ARRAY[...], ARRAY[...]])?
--
Best regards,
Nikolay
Attachments:
xpath.w.namespaces.20070220.patch (text/x-patch; charset=ANSI_X3.4-1968), +244 -1
Your patch has been added to the PostgreSQL unapplied patches list at:
http://momjian.postgresql.org/cgi-bin/pgpatches
It will be applied as soon as one of the PostgreSQL committers reviews
and approves it.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://www.enterprisedb.com
+ If your life is a hard drive, Christ can be your backup. +
I tried this patch but found this regression failure:
-- Considering only built-in procs (prolang = 12), look for multiple uses
-- of the same internal function (ie, matching prosrc fields). It's OK to
-- have several entries with different pronames for the same internal function,
-- but conflicts in the number of arguments and other critical items should
-- be complained of. (We don't check data types here; see next query.)
-- Note: ignore aggregate functions here, since they all point to the same
-- dummy built-in function.
SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
(p1.proisagg = false OR p2.proisagg = false) AND
(p1.prolang != p2.prolang OR
p1.proisagg != p2.proisagg OR
p1.prosecdef != p2.prosecdef OR
p1.proisstrict != p2.proisstrict OR
p1.proretset != p2.proretset OR
p1.provolatile != p2.provolatile OR
p1.pronargs != p2.pronargs);
oid | proname | oid | proname
------+-------------+------+-------------
2931 | xpath_array | 2932 | xpath_array
(1 row)
This is because you are calling xpath_array with 2 and 3 arguments.
Seems we don't do this anywhere else.
I also had to add an #ifdef USE_LIBXML around xml_xmlnodetotext(). Please
research a fix to this and resubmit. Thanks.
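For readers unfamiliar with this sanity check, the part that fires here boils down to "two catalog entries share the same prosrc but differ in pronargs". A rough sketch of just that condition in Python, over hypothetical and heavily simplified pg_proc rows (only the two OIDs and the name xpath_array come from the output above; the prosrc values and the third row are assumptions for illustration, and the real query compares several more columns):

```python
from itertools import combinations

# Hypothetical, simplified catalog rows; prosrc values are assumed, not taken from the patch
PG_PROC = [
    {"oid": 2931, "proname": "xpath_array", "prosrc": "xpath_array", "pronargs": 2},
    {"oid": 2932, "proname": "xpath_array", "prosrc": "xpath_array", "pronargs": 3},
    {"oid": 9001, "proname": "demo_func", "prosrc": "demo_func", "pronargs": 1},
]

def conflicting_pairs(rows):
    """OID pairs whose entries share an internal function but differ in argument count."""
    ordered = sorted(rows, key=lambda r: r["oid"])
    return [
        (a["oid"], b["oid"])
        for a, b in combinations(ordered, 2)  # yields each pair once, lower oid first
        if a["prosrc"] == b["prosrc"] and a["pronargs"] != b["pronargs"]
    ]

print(conflicting_pairs(PG_PROC))  # [(2931, 2932)]
```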
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
On 3/3/07, Bruce Momjian <bruce@momjian.us> wrote:
I tried this patch but found this regression failure:
oid | proname | oid | proname
------+-------------+------+-------------
2931 | xpath_array | 2932 | xpath_array
(1 row)
This is because you are calling xpath_array with 2 and 3 arguments.
Seems we don't do this anywhere else.
I also had to add an #ifdef USE_LIBXML around xml_xmlnodetotext(). Please
research a fix to this and resubmit. Thanks.
OK.
I'll fix these issues and extend the patch with regression tests and
docs for xpath_array(). I'll resubmit it very soon.
--
Best regards,
Nikolay
On 3/4/07, Nikolay Samokhvalov <nikolay@samokhvalov.com> wrote:
I'll fix these issues and extend the patch with regression tests and
docs for xpath_array(). I'll resubmit it very soon.
Here is a new version of the patch. I didn't change any part of docs yet.
Since there were no objections I've changed the name of the function
to xmlpath().
--
Best regards,
Nikolay
Attachments:
xpath.w.namespaces.20070304.patch (text/x-patch; charset=ANSI_X3.4-1968), +306 -1
What about this patch? Although it is not large, XML functionality in 8.3
will be weak without it.
Will it be accepted?
Nikolay Samokhvalov wrote:
What about this patch? Although it is not large, XML functionality in 8.3
will be weak without it.
Will it be accepted?
In principle I am in favor of the patch.
Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?
Perhaps even something in a special namespace?
cheers
andrew
On 3/17/07, Andrew Dunstan <andrew@dunslane.net> wrote:
In principle I am in favor of the patch.
Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?
Perhaps even something in a special namespace?
I did think about it, but I didn't find any difficulties with a simple
<x>...</x>. The thing is that, regardless of the element name, we apply a
corresponding shift to the XPath expression, so there cannot be any
problem from my point of view... But maybe I am missing something and
it's better to avoid a _possible_ problem. It depends on PostgreSQL code
style itself: what is the best approach in such cases? To avoid
unknown possible difficulties or to be clear?
--
Best regards,
Nikolay
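The argument above can be illustrated with a small sketch. It is only an analogue written with Python's standard xml.etree.ElementTree, not the patch's code: here the query is simply evaluated relative to the dummy root, whereas how the patch actually shifts the XPath expression is not shown in this thread. The helper name find_in_fragment and the alternative wrapper name are made up for illustration.

```python
import xml.etree.ElementTree as ET

def find_in_fragment(fragment, path, wrapper="x"):
    """Evaluate `path` against a multi-rooted fragment by wrapping it in a dummy root."""
    root = ET.fromstring(f"<{wrapper}>{fragment}</{wrapper}>")
    # The path is applied relative to the wrapper, so the wrapper's name never enters the query
    return [el.text for el in root.findall(path)]

fragment = "<x>one</x><x>two</x>"  # the fragment itself uses <x> elements
print(find_in_fragment(fragment, "x"))
print(find_in_fragment(fragment, "x", wrapper="unlikely_root") == find_in_fragment(fragment, "x"))
```

In this sketch the result is the same whichever wrapper name is chosen, even when the fragment contains elements with the wrapper's name.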
If you are sure that it won't cause a problem then I think it's ok to
leave it, as long as there is a comment in the code that says why we are
sure it's ok.
cheers
andrew
On 3/5/07, Nikolay Samokhvalov <samokhvalov@gmail.com> wrote:
Here is a new version of the patch. I didn't change any part of docs yet.
Since there were no objections I've changed the name of the function
to xmlpath().
The updated version of the patch contains a bugfix: there was a problem
with path queries that point to elements (cases where a set of
document parts corresponding to subtrees should be returned).
An example (included in the regression tests):
xmltest=# SELECT xmlpath('//b', '<a>one <b>two</b> three <b>etc</b></a>');
xmlpath
-------------------------
{<b>two</b>,<b>etc</b>}
(1 row)
Waiting for more feedback, please check it.
--
Best regards,
Nikolay
Attachments:
xpath.w.namespaces.20070318.patch (text/x-patch; charset=ANSI_X3.4-1968), +326 -1
Patch applied.
Please provide a documentation addition. Thanks.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
Applying newest version of this patch now; still needs documentation.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
Nikolay Samokhvalov wrote:
Here is a new version of the patch. I didn't change any part of docs
yet. Since there were no objections I've changed the name of the
function to xmlpath().
I didn't see any discussion about changing the name to xmlpath. Seeing
that the function implements xpath, and xpath is a recognized name,
this change is wrong.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Bruce Momjian wrote:
Patch applied.
This code seems to think that if an xml datum starts with "<?xml" it's a
document. That is completely bogus.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Nikolay Samokhvalov wrote:
Here is a new version of the patch. I didn't change any part of docs
yet. Since there were no objections I've changed the name of the
function to xmlpath().
Why is the function not strict?
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Andrew Dunstan wrote:
Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?
Why do we even need to support xpath on fragments?
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
Nikolay Samokhvalov wrote:
Also, maybe someone can suggest better approach for passing namespace
bindings (more convenient than ARRAY[ARRAY[...], ARRAY[...]])?
Your code assumes
ARRAY[ARRAY['myns', 'myns2'], ARRAY['http://example.com', 'http://example2.com']]
Shouldn't it be
ARRAY[ARRAY['myns', 'http://example.com'], ARRAY['myns2', 'http://example2.com']]
?
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
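The difference between the two layouts discussed above can be made concrete with a short Python sketch (the function names from_parallel and from_pairs are made up for illustration; neither is part of the patch):

```python
def from_parallel(ns_array):
    """Layout the code assumes: one row of prefixes, one row of URIs."""
    prefixes, uris = ns_array
    if len(prefixes) != len(uris):
        raise ValueError("prefix and URI rows must have the same length")
    return dict(zip(prefixes, uris))

def from_pairs(ns_array):
    """Layout suggested instead: one [prefix, uri] row per binding."""
    for row in ns_array:
        if len(row) != 2:
            raise ValueError("each binding must be a [prefix, uri] pair")
    return {prefix: uri for prefix, uri in ns_array}

parallel = [["myns", "myns2"], ["http://example.com", "http://example2.com"]]
pairs = [["myns", "http://example.com"], ["myns2", "http://example2.com"]]

print(from_parallel(parallel) == from_pairs(pairs))  # True
```

Note that with exactly two bindings both layouts are 2x2 arrays, so feeding one layout to code expecting the other is not rejected; it silently produces wrong bindings (from_pairs(parallel) maps "myns" to "myns2").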
On 3/23/07, Peter Eisentraut <peter_e@gmx.net> wrote:
Andrew Dunstan wrote:
Would it be better to use some more unlikely name for the dummy root
element used to process fragments than <x> ?
Why do we even need to support xpath on fragments?
Why not? I find it useful and convenient.
--
Best regards,
Nikolay
On Wednesday, 4 April 2007 14:42, Nikolay Samokhvalov wrote:
Maybe it's worth starting to keep additional information in the xml datum (i.e.
an IS_DOCUMENT bit and, more importantly for the xpath() function, a bit
indicating that the XML value has only one root and can be treated as a tree,
so there is no need to wrap it in <x> .. </x>). But this change requires
additional time to design the xml datum structure and to rework the code
(macros, I/O functions...).
To determine if an XML datum is a document, call xml_is_document(). The
implementation of that function is probably not the best possible one, but
what the xpath() code does is totally wrong nevertheless.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
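A small sketch of why a leading "<?xml" says nothing about whether a value is a document. It uses Python's standard xml.etree.ElementTree as a stand-in parser and is not how xml_is_document() is implemented; it also assumes that a non-document (content) value may itself carry an XML declaration. Both helper names are made up for illustration.

```python
import xml.etree.ElementTree as ET

def starts_with_declaration(value):
    """The prefix test criticized above."""
    return value.startswith("<?xml")

def parses_as_document(value):
    """True if the value parses as a well-formed, single-rooted document."""
    try:
        ET.fromstring(value)
        return True
    except ET.ParseError:
        return False

with_decl_two_roots = '<?xml version="1.0"?><a/><b/>'
no_decl_one_root = "<a>text</a>"

for value in (with_decl_two_roots, no_decl_one_root):
    print(starts_with_declaration(value), parses_as_document(value))
```

The two tests disagree on both sample values: the first has a declaration but two top-level elements, the second has no declaration yet is a complete document.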
On Wednesday, 4 April 2007 14:42, Nikolay Samokhvalov wrote:
Why is the function not strict?
Because if the 3rd argument (NS mappings) is NULL, we shouldn't
return NULL immediately:
If the namespace mapping is NULL then it is unknown, and therefore the result
of the XPath expression cannot be evaluated with certainty. If no namespace
mapping is to be passed, then you should pass a list (or array, ...) of length
zero.
--
Peter Eisentraut
http://developer.postgresql.org/~petere/
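The semantics argued for here can be sketched as follows, with Python's None standing in for SQL NULL. The names xpath_strict and evaluate are hypothetical, and evaluate is only a stand-in that reports what it was given rather than running a real XPath query.

```python
def evaluate(query, document, bindings):
    """Hypothetical stand-in for the real evaluation; it only reports its inputs."""
    return f"evaluated {query!r} with {len(bindings)} binding(s)"

def xpath_strict(query, document, ns_bindings):
    """Strict behaviour: any NULL (None) argument makes the result NULL (None)."""
    if query is None or document is None or ns_bindings is None:
        return None
    # An empty list means "no bindings", which is different from "unknown bindings"
    return evaluate(query, document, dict(ns_bindings))

print(xpath_strict("//a", "<a/>", None))
print(xpath_strict("//a", "<a/>", []))
```

A caller with no namespaces to declare passes an empty list and still gets a result; only an unknown (None) mapping propagates to the output.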